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Preface 


This volume is neither just for the specialist nor just for the novice. Rather, it 
should appeal to anyone sufficiently interested in psychology to have obtained a 
sense of its current history. This book is not simply a “state of the art” book on 
recent psychological research, although a wide range of fundamental research 
topics is carefully discussed—such topics as perception, action, memory, develop- 
ment, esthetics, language, and speech. Nor do the essays contained in this 
volume merely reflect a “business as usual” attitude about contemporary psy- 
chological theory. Through these essays the authors express a collective attitude 
that a careful scrutiny of the fundamental tenets of contemporary psychology 
may be needed. In some essays specific faults in the foundations of an area are 
discussed, and suggestions are made for remedying them, In other essays the 
authors flirt with more radical solutions, namely, beginning from new founda- 
tions altogether. 

Although the authors do not present a monolithic viewpoint, a careful reading 
of all their essays under one cover reveals a glimpse of a new framework by 
which theory and research may be guided. Hence, the subtitle of the book, 
Toward an Ecological Psychology, distills the essence of their tacit agreement, 
namely, a deeply felt commitment that a full appreciation of what the animal’s 
or human’s world is like provides an indispensable context of constraint for 
understanding how information about such a world may be processed. Here a 
consensus obtains that justifies the book as it did the holding of the conference 
from which the book was engendered. The success of our joint efforts repre- 
sented by this volume may well be likened to the proverbial blind men who 
attempt to ascertain the nature of a large and complicated animal from multi- 
farious and necessarily limited viewpoints: There is no argument as to the 
existence of the beast but only over what form it might ultimately assume. 
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viii PREFACE 


The chapters in this volume derive from a conference on Perceiving, Acting, 
and Knowing held by the Center for Research in Human Learning at the 
University of Minnesota, July 23-August 17, 1973. In order to better under- 
stand reasons for including the present chapters, however, it is necessary to 
consider happenings whose history extends even farther back in time. 

In one way or another, most of the contributors to this volume have been 
associated with the Center for Research in Human Learning at the University of 
Minnesota. Many contributors were graduate students at the center (Bransford, 
Franks, Mace, McCarrell, Pittenger, Strange, Verbrugge, Weimer); others were 
postdoctoral associates (Pufall, Shankweiler, Shaw). Still others had been visiting 
professors or lecturers whose ideas played important roles in directing thinking 
at the center (Cunningham, Gibson, Pribram). Three contributors were faculty 
members of the center (Jenkins, Maratsos, Shaw) and one contributor (Nitsch) 
was a student (and frequently a teacher) of previous center students. Only 
Hester, Loftus, and Turvey remain. Hester was a graduate school colleague of 
Shaw at Vanderbilt, so we became aware of his interest in art and perception. 
The conference provided an excellent opportunity to hear his ideas. Loftus was 
one of the pioneers in the newly budding field of semantic memory research, 
and the conference provided an ideal vehicle for hearing her. Turvey’s work on 
action and perception was extremely relevant, so it seemed almost inconceivable 
to hold the conference without him. 

The spark that initiated the Minnesota conference was the Pennsylvania State 
University Conference on Cognition and the Symbolic Processes (see Weimer & 
Palermo, 1974). Participants at that conference explored some highly exciting 
ideas and the many participants expressed the desire to explore them in greater 
detail. The present editors therefore proposed the Minnesota Conference and 
view it as an outgrowth of the Pennsylvania State University Conference. Both 
conferences were oriented toward the development of an Ecological Psychology, 
a theme that is further explored in the introductory chapter to this volume. 

We are indebted to many people for their contributions to the conference and 
the present volume. Our primary debt is to the faculty and staff at the Center 
for Research in Human Learning for helping the conference run so smoothly. 
Kathy Casey deserves a special note of thanks in this regard. The funds for the 
conference was provided by NICHD Grant HD-00098, and we are extremely 
grateful for this. 

Jerry Wald did a great deal of the editing. We cannot thank him enough for his 
outstanding work. Bill Mace also contributed his highly skilled expertise to the 
editing process, and we are extremely grateful to him. Thanks are also due to 
Lawrence Erlbaum and his staff for their skillful and enthusiastic support while 
preparing this book. 

Much of the editing was done while Robert Shaw was a fellow at the Center 
for Advanced Study in the Behavioral Sciences at Stanford. This support is most 
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gratefully acknowledged. John Bransford is also grateful for support from the 
National Institute of Education (NE-6-00-3-0026) while working on the book. 

Our greatest thanks extends to James J. Jenkins for his leadership during the 
first ten years of the Center for Research in Human Learning. Through his 
guidance and enthusiasm, the center became an exciting environment for explor- 
ing new directions and approaches, We are extremely pleased to dedicate this 
volume to him. 
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Introduction: Psychological 
Approaches to the Problem 
of Knowledge 


Robert Shaw 
University of Connecticut 


John Bransford 
Vanderbilt University 


The major issue addressed by the authors of this volume concerns the problem 
of how we obtain and act upon knowledge about the world. Obviously, no single 
nor simple answer to such a complex problem can be given. Yet the problem of 
knowledge has been the central focus of psychological theories since the incep- 
tion of the field as an experimental science over a century ago. Act psychology, 
structuralism, functionalism, radical behaviorism, Gestalt psychology, neobehav- 
iorism, and, finally, information-processing theories have all attempted answers 
to this problem—with relative degrees of success. 

The authors in the present volume offer hypotheses, speculations and tentative 
solutions to aspects of the problem of knowledge. Since they present a diversity 
of ideas, it seems helpful to read the various chapters with some guiding 
questions in mind. The purpose of this chapter is to provide a few such questions 
that the reader might add to his or her own. 

Among the many significant questions raised in this volume there is one of 
paramount concern to us all, namely, whether the traditional approach to 
psychological theory and research which emphasizes the how of psychological 
processing might not need supplementation by an alternative approach that 
emphasizes the what of psychological processing. If we conceive of humans and 
animals as information processing systems, then perhaps it would be wise to ask 
not only how such information is processed but also what the nature of such 
information might be. One answer, and the one we wish to examine in this 
chapter, runs like a thread throughout the fabric of this volume, tying together 
the variety of ideas into a coordinated patchwork: What is the nature of the 
epistemic process by which people know their world and the nature of the world 


that it might be known? 
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We call the common thread of interest in this question the ecological attitude— 
an attitude toward theory and research that fully appreciates the world as the 
source of information by which animals and humans perceive events, compre- 
hend circumstances, and act successfully in the service of biological, psychologi- 
cal, and social needs. To the extent, however, that there exists dissension among 
the authors of the subsequent chapters as to what constitutes an adequate 
answer to the what-question posed by the ecological approach, to that extent 
the issues separating traditional information-processing approaches from the 
ecological approach are sharply dramatized. To aid the reader in an appreciation 
of these issues we turn now to a discussion of the assumptions of each approach. 


PART ONE: THE ECOLOGICAL APPROACH TO PSYCHOLOGY 


The Need for a New Approach 


In the attempt to develop new approaches to old problems, inevitably questions 
arise regarding the unanalyzed and, therefore, unchallenged presuppositions 
tacitly or explicitly accepted by the field. To establish a new brand of psycho- 
logical theory necessarily requires that several fundamental tenets of traditional, 
contemporary psychology be challenged and, if possible, overthrown to make 
way for the new approach. However, before presenting the tenets of the 
ecological approach to the problem of knowledge, it will be useful to consider 
some of the major assumptions and hypotheses that underlie current cognitive 
psychology, since this is the area of psychology which has assumed responsibility 
for solving the problem of knowledge. 


The man-machine analogy. Current cognitive theory may best be charac- 
terized as the information-processing approach—an approach that feeds upon the 
man-machine, or brain-computer, analogy. This analogy has been the staple of 
most psychological theorists’ diets for the last decade or so. 

The information-processing approach, more than any previous one, seems to 
recognize the need for precision when characterizing psychological processes. In 
their search for ways to construct models that explain how knowledge might be 
represented, stored, and retrieved from memory, it was natural for psychologists 
to turn for help to the fast developing field of computer science, once the 
promise of behaviorism was seen to be overblown. After all, what seemed to be 
analogical problems were being explicitly addressed by computer scientists in 
their attempts to understand complex information-processing devices. How 
natural it was to extend the concept of computer to include the psychological 
machinery of humans in order to see if analogs of the information-processing 
solutions might not be found to hold there as well. Thus, there is no great 
mystery in explaining how current cognitive theory came to tely so heavily on 
computer science concepts and its modeling techniques. For many cognitive 
theorists their prayers for precision seemed to have been answered. 
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There should be little quarrel with the contention that this metaphor has in 
many ways served our field well, for it has helped to free many psychologists 
from the excessive restrictions of radical behaviorism. Similarly, this mechanistic 
metaphor has reinstated a serious and worthwhile concern for understanding 
how knowledge is represented and organized within humans and animals as 
self-organizing, knowledge-gathering systems. Moreover, the acceptance of the 
analogy of cognitive structures to computational structures (for example, net- 
works, hierarchies, or heterarchies of control functions) has the additional virtue 
of fostering an appreciation for exactitude in theory that ultimately demands 
mathematically rigorous models (Cunningham, this volume). All this is surely an 
improvement over the simplistic serial processing models offered by stimulus- 
response theory over the preceding decades. 

Yet we psychologists can not be content to rest on our theoretical laurels, 
especially when garnered so easily from the successes of another field. It would 
be both unfair and unwise to expect the solutions given by computer scientists 
to their problems to also qualify mutatis mutandi as solutions to our own 
problems, since serious questions can be raised whether or not the two sets of 
problems are even abstractly the same (Shaw & McIntyre, 1974). 

It is wise to remember that computer scientists’ systems are artificially con- 
trived while ours are naturally evolved; their systems are passive while ours are 
active; their systems are purposeless (except in a second-hand way) while ours 
pursue primary goals of self-survival and adaptation; their systems, as complex as 
they are, are still astronomically simpler than those psychologists must under- 
stand. And what of the role of emotional, personality, and social factors in 
determining the questions appropriate to humans but as yet undefined for 
artificial systems? Indeed, to be realistic, we should ask who should be the tutor 
and who the tutored? Consequently, it is not unreasonable to suggest that at this 
juncture in the history of these two young sciences the most relevant concepts 
computer science may offer psychology are still in the offing. Thus, not 
pessimism but prudence counsels psychologists to assume a wait-and-see attitude 
before pronouncing sentence upon the man-machine analogy. Indeed, the pro- 
found question of what constitutes a machine hangs in the balance over and 
against the question of what is a human. The answer may surprise us all. 

On the other hand, there is the danger that premature application of the 
man-machine analogy to psychology, to the exclusion of other approaches, may 
unnecessarily restrict our thinking about problems whose solutions demand 
greater theoretical flexibility than strict adherence to the analogy allows. For 
instance, as psychologists, there is a natural and understandable tendency to 
ignore certain complications of our phenomena for the sake of expediency in 
either doing experiments or constructing models. Rarely do we clearly enunciate 
any method by which theoretical principles designed to explain animal and 
human behavior in the laboratory might be generalized back to the natural 
contexts in which the behaviors originated. We tend to forget that humans and 
animals are active, investigatory creatures driven by definite intents through a 
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complex, changing environment replete with meaning at a variety of levels of 
analysis. 

Thus we feel no tinge of theoretical compunction in blithely comparing such 
active, knowledge-seeking beings with unconscious, static machines that lack a 
wit of natural motivation. Unlike humans and animals, who perceptually mine 
the world for information on a need-to-know basis, artificial systems can only 
soak up information passively by being spoon-fed batches of alpha-numeric 
characters that have been conceptually predigested by human programmers. In 
sharp contrast to natural information, such symbolic information is devoid of 
any natural meaning for the machine and, therefore, can play no role in either its 
adaptation or survival in a relatively dust-free, humidity controlled, air-condi- 
tioned environment. In such a sterile model for man, perceiving becomes a 
passive process and knowing a purposeless one, and as for action (that is, 
purposive behavior), it remains nonexistent. 

Of course, it would be grossly unfair to criticize the computer science area for 
the rather myopic view many psychologists have taken of such problems, since 
too often the best concepts that computer science offers are much better than 
what psychologists elect to borrow. The theory of locomoting, special-purpose 
computers (robots), as well as current attempts by cyberneticists to devise 
prosthetic devices based on computer components (for example, artificial limbs 
and helmets with video inputs to the tactile system to aid the blind) are among 
the avant garde projects which may inspire new life in the nearly dead man-ma- 
chine metaphor, and thereby open fresh vistas for psychological theorists. Again 
it is wiser to counsel patience than pessimism regarding the ultimate worth of 
the metaphorical extension of the techniques of either field to the province of 
the other. Rather let us critically scrutinize the current coinage of the metaphor 
so as to assay what pragmatists would call its “cash value” as a working 
hypothesis. 

Some light might be shed on the nature of the quest for alternatives to the 
man-machine analogy if a sharp line of delineation is drawn around those 
epistemological questions explicitly asked, or tacitly assumed by the field, and 
those equally significant questions given short shrift. Typically, the questions 
given short shrift are those concerning the nature and origins of the information 
processed; how ecologically significant information is selected; and how its value 
to the animal or human might be determined. The latter questions focus on what 
is processed whereas current theory focuses on the how of processing. Attempts 
to reduce questions regarding what is processed to questions regarding how what 
is processed is processed are doomed to failure because the two questions fail to 
address the same issue. 

More specifically, the man—machine analogy is overdrawn when human percep- 
tion is likened to the decoding process by which computers compile data, and 
human knowledge to the concept of retrieval of information from storage. The 
man-machine analogy becomes a hindrance rather than an aid to psychological 
theory when it derails our thinking about how living creatures gather and act 
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upon knowledge in dynamic natural contexts. Such questions can in no way be 
reduced to questions of how information is represented, stored, or retrieved 
from storage by static devices in artificially controlled environments. Rather the 
question should be how information “stored” in the world can be perceptually 
extracted by active, investigative creatures in their relatively successful effort to 
survive calamity and achieve well-being. 

If the ecological approach merits any serious consideration as an alternative 
approach to psychological theory, then it might best be revealed by contrasting 
this approach with the more popular one of information processing psychology. 
In fairness, however, it should be remembered that although the ecological 
approach may at this time be only programmatic, in the sense that it offers 
embryonic theory sketches in the place of fully adequate theories of the 
knowledge-gathering activities of humans and animals, the information-pro- 
cessing approach is no less programmatic. So far it too has failed to offer 
adequate solutions to any of the major problems of psychology (for example, 
perception, memory, attention, action, language, motivation, emotion, per- 
sonality). The timetable for success of any approach cannot be predicted, 
although the probability of success might be, if we but consider which of the 
approaches poses the significant questions most sensibly. The judicious alloca- 
tion of the resources of our field demands that we avoid, if we can, the vain 
pursuit of will-o-the-wisp questions. 

Let us not, however, delude ourselves: Our field is much too young, our 
theories too sketchy, the mettle of our techniques too untried, to condemn with 
certitude any approach without fair trial. Consequently, the purpose of this 
introductory chapter (and the whole book for that matter) is not to reject out of 
hand any existing program of research, but to make way for an alternative 
program—the ecological approach. 

In the next sections of this chapter, we address several fundamental issues of 
psychological theory from the vantage point of the ecological approach, ques- 
tions such as: What is the nature of knowing? Is perception direct or mediated? 
What is the role of inference and memory in perceiving the true properties of the 
world? And, finally, what are the recalcitrant issues that most sharply distinguish 
current information processing approaches to psychological theory from the 
alternative ecological approach adopted by many of the authors (although not 
all) of this volume. Since the last question is the guiding thread of this essay, it 
will be discussed most fully. 


The Ecological and Information-Processing 
Approaches Compared 


A wise adage asserts that man proposes but nature disposes! This aphorism 
elegantly expresses the fundamental tenet of the ecological approach—that the 
nature of humans is inextricably intertwined with the nature of a world in which 
they live, move, and have their being. In short, the ecological attitude is founded 


6 ROBERT SHAW AND JOHN BRANSFORD 


on the fundamental belief that man is indeed the mirror of nature. This view 
does not suggest, however, that psychological phenomena may be reduced to 
physical ones, but that physical phenomena be interpreted in the light of their 
relevance to psychological phenomena—an attempt, if you will, to “ecologize” 
physics so as to understand better how the world provides a habitat for humans 
as knowing-agents, and not just as physical or biological objects. Thus the 
ecological attitude germinates in the minds of theorists who come to the stark 
realization that humans, as must all living things that survive, depend upon their 
natural and cultural environments for knowledge as well as victuals. For such 
theorists it is impossible to accept any longer models for man whose only virtue 
is that they were contrived to function efficiently in artificial contexts. 

The environment in which animals and humans have evolved, and upon which 
they depend for their well-being can be construed from both the material and 
functional viewpoints: The material environment of a species provides resources 
in the form of “hard” goods, such as air, water, food, shelter, mates and tools, 
while the functional environment provides resources in the form of “soft” 
goods, such as the requisite knowledge-gathering experiences for planning and 
executing adaptive actions. By adaptive actions we mean, rather redundantly, 
those perceptual and behavioral activities by which the hard or soft goods 
needed for survival and well-being are secured. Taken together, the sum total of 
the material and functional aspects of the general environment required by a 
particular species in order to achieve optimal adaptation, that is, the requisite 
hard and soft goods, defines an ecological niche for that species. Hence the 
ecological niche for an animal is not where it lives (its habitat) but how it lives. 
(See Gibson, this volume.) 

The ecological approach calls for nothing less than a complete understanding 
of the complex and everchanging relationship of person-as-knower to the en- 
vironment-as-known. Hence a complete understanding of the informational 
aspects of the ecological niche, as well as the behavioral consequences of such 
information, is of paramount concern to the ecological approach, regardless of 
the animal under study—humans not excluded. But, as we shall see, the concept 
of ecologically significant information (Mace, this volume; Shaw, McIntyre, & 
Mace, 1974) is as radically different from the concept of information used in 
communication theory, computer science, or some psychological theories cur- 
rently in vogue, as the concept of ecologically significant action patterns is 
radically different from the behaviorists’ concept of unintentional animal move- 
ments (Turvey, Weimer, this volume). 
$ It is natural, however, when attempting to clarify a new approach to problems 
in an area, that the old way of asking the questions come under fire. The subtitle 
of the next chapter by William Mace is a dramatic case in point: Ask not what’s 
inside your head but what your head is inside of. This way of putting the issue 
of how best to represent knowledge speaks boldly in favor of the ecological 
attitude, namely, that as theorists we not lose sight of our major phenomenon: 
people and animals are knowing-agents who live and learn within a broader 
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context of epistemic constraints than subjective theories of knowledge would 
have us believe. (It is ironic, given our behaviorist heritage, that the pendulum 
has swung so far back in the other direction that most contemporary cognitive 
theories tend to be subjective in just this way.) 

Mace’s restatement reminds one of Ryle’s (1949) wise admonition, too often 
unheeded by psychologists, to beware of careless usage of spatial metaphors 
when addressing the question of how knowledge accrues. Perception, even 
analogically speaking, is not a process by which the senses, like some itinerant 
mailman, collects coded messages about world facts and tosses them into the 
mailbox of the mind to be deciphered, sorted, and stored in memorial pigeon- 
holes by some mysterious little postal clerk and perused by him at some later 
time. 

Such a view needlessly dichotomizes the act of perceiving from the act of 
knowing just as it does the act of knowing from the act of remembering. One 
might ask, if we glibly treat these processes as separate, as many information- 
processing theories tend to do, if there is any real hope of getting our Humpty 
Dumpty theory back together again. (Bransford, McCarrell, Franks & Nitsch, 
Jenkins; this volume). 

Even worse, by positing a functional sequence of information-processing stages 
that suggest a spatial and temporal arrangement (for example, iconic store, 
short-term memory, long-term memory), such theories become unavoidably 
muddled when asked to indicate the stage at which, or by which, a percept is 
experienced, a memory recalled, a concept understood, language comprehended, 
a problem solved, or an action planned. (See Jenkins, Jiménez-Pabdn, Shaw, & 
Sefer, 1974, for an alternative approach.) Notice the problems the sequential 
stage model is heir to: Who does the experiencing in this model? Must we posit 
some little man who views the output of the final stage of the processing 
sequence? If we do, what is to prevent us from asking for still another theory to 
explain his experience? Would we not then be forced to posit within his tiny 
head a more miniscule sequence of processing stages complete with its own tiny 
observer to witness its final output? And then, within the cranium of this 
Lilliputian observer, are we not forced to posit still a more miniscule sequence 
with its diminutive observer, ad infinitum? 

How do we stop passing the seat of the experience, like the proverbial buck, 
from the highest level of processing ever inward through a potentially infinite set 
of nested sequences of processing stages? Once begun, by what principle might 
the regress be terminated? No one has yet suggested a cogent way of getting such 
theories off the slippery slide of a regress through an infinite phalanx of 
homunculi, once the chaining and nesting of processing stages has been assumed. 
However, since the homunculus hypothesis seems to enjoy an atavistic claim on 
each generation of theorizing, let us pause to consider it in more detail. 


The homunculus hypothesis. Both Attneave (1961) and Bullock (1961) 
freely endorse a hierarchical notion of the homunculus hypothesis which they 
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believe to be nonregressive and nonghostly. However, neither view offers a 
coherent theory of how such homunculi concepts can be entertained without 
first solving the semantic categories problem, that is, the problem of providing a 
finite representation of the potentially infinite set of behaviors and perceptions 
available to animals and humans. In short, it is the problem of how to provide 
descriptions of what is perceived, or what constitutes acts, that is sufficiently 
accurate to be empirically true and sufficiently economical to be theoretically 
manageable. 

Bullock (1961) quite simply identifies the experiences underlying our knowl- 
edge of perceptions and actions with the converging firing patterns of neurons. 
Thus, he unequivocally endorses a strong version of what philosophers call the 
identity solution to the mind-body question. Although he believes this hier- 
archical model avoids the infinite regress problem for homunculi, since the 
number of neural levels posited is finite, it opens up a regress of a different 
sort—a regress due to the unrestricted differentiation of semantic categories. 
Since Bullock’s hierarchical approach is a very popular one in contemporary 
psychology, let us see where such theories go astray. 

Bullock suggests that for any given behavior there must ultimately be at least 
one neuron that “decides” whether to initiate the behavior or not. Since rational 
behaviors are nearly always decided on the basis of current perceptual experi- 
ences, we must assume that this ultimate neuron (or, at least the collection of 
“cardinal” neurons feeding it) is also the seat of such experiences. Consequently, 
there must be as many neural homunculi of this sort as there are coherent 
perceptions or behaviors. The crucial question that must be resolved if this 
theoretical scheme is going to work at all is what counts as a coherent pattern of 
experience. If the potential number of such experiences is left unrestricted, in 
principle, then, in violation of Occam’s wise admonition, the theory sinks into 
the mire of speculation carried under by the weight of too many imponderables. 

But surely the speaking, writing, or reading of sentences qualifies as a coherent 
experience, as does the recognizing of objects and events, or the solving of 
problems—a myriad of which confront us day-by-day. Moreover, since the 
collection of such sets of experiences is astronomically large, with each set 
comprised of a potentially infinite membership, it is difficult to see how 
pandemonium-like, hierarchical schemes help at all. 

By postulating neural homunculi that correspond in a one-to-one fashion with 
such experiences, Bullock does not avoid a regress into a potentially infinite 
hierarchy of conscious entities. The fallacy is in believing that the finite number 
of candidate neurons, which materially restrict the number of causal supports 
for experiences, in any way functionally restricts the variety of experiences 
possible. What is needed for such theoretic schemes to be useful is a principled 
restriction on the categories of experiences available—a careful delimitation of 
what is known. 

Clearly, as Bullock himself argues, there seems to exist considerable redun- 
dancy in the neural support for experience; the mapping of categories of 
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experience onto neural functions as demanded by the mind-body indentity 
hypothesis might just as likely be one-to-many as many-to-one. Therefore, if 
one-to-many, that is, if there are more potential neural firing patterns than 
meaningful experiences, given the finiteness and mortality of man, then the 
problem is how the vast inventory of such functional supports for experiences 
might be realistically restricted. Thus, theories like Bullocks do not suffer from 
being too weak but from being too powerful—an embarrassment of riches rather 
than a paucity of concepts. 

The crucial question then is how the taxonomy of significant experiences can 
be realistically restricted, rather than how such experiences might be func- 
tionally supported once they are clearly defined. The ecological approach to 
psychology, with its concept of ecologically significant information, purports to 
offer a methodological tool for trimming the fat from such theories of how 
processing occurs. (Gibson, Mace, this volume.) 

Attneave (1961) offers a more unitary view of the homunculus hypothesis, but 
like Bullock’s pluralistic view also fails to avoid the semantic regress, that is, 
entailing an unrestricted set of categories of experience. He argues that the 
homunculus is really just an “organism-within-an-organism” which possesses its 
own receptor surface nested within the receptor surface of the organism. “It is 
supposed,” he asserts, “that each receptor at the surface of H [the homunculus] 
is sensitive to a class of objects or events; for example, that the homunculus has 
a receptor for dogs, another for pianos, another for automobiles, and so on 
[p. 778] .” 

Although Attneave means the above classes of experiences to be treated as 
variables, and hence to be to some extent abstract, he fails to explain how the 
number of variables might be kept below the maximum possible. Since this 
maximum is clearly a potentially infinite set, again a set of principled restrictions 
is required if his theory is to avoid the regress of infinitely differentiating 
semantic categories. 

To be fair, it should be pointed out that this semantic regress is not the one 
Attneave or Bullock were trying to avoid in their theory sketches, but it is one 
that is just as damning for a theory of knowledge as the sequential stage regress 
entailed by other information-processing theories. Indeed, it is the semantic 
Tegress which has all but brought to a halt the psycholinguistic revolution of the 
1960s that was inaugurated by Chomsky’s transformational theory of English 
syntax. 

To the extent that our theories cannot avoid violation of Occam’s wise 
admonition not to multiply theoretical entities beyond necessity, the regressive 
fallacy will be a thwart to cogent theory construction in psychology. It matters 
little whether such regresses are potentially infinite in principle or of just 
theoretically unmanageable finite lengths, the outcome is the same— 
unacceptable theory. 

Thus, we must conclude that where sequential stage models offer no true 
Solution to the problem of psychological experience but only a corridor of 
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mirrors to endlessly reflect ourselves, so hierarchical theories like the above offer 
only a way of endlessly reflecting inwardly the infinite facets of the world. If the 
ecological approach can avoid either form of the regressive fallacy, it will have to 
be considered a front runner in the scientific sweepstakes. 

Hence, given the desire to avoid the regresses arising from the use of vague 
spatiotemporal metaphors or the assumption of unrestricted categories of experi- 
ence, it is wise to avoid altogether certain insidious forms of theoretic questions 
that entail these problems. We now see that such questions as, Where does the 
last stage of perceptual processing occur? or, Where in the sequence of stages can 
we fit the knowing-agent? are philosophical red herrings which lead our theories 
astray. In contrast, we shall argue that the ecological approach provides at least 
one sensible response to such questions. From the ecological viewpoint the 
knowing-agent is not some final stage in the epistemic process, some caboose at 
the end of a train of ideas, nor even a missing link in a cause-and-effect chain; 
rather the knowing-agent is the totalité of the process itself. Knowing is itself the 
process which resides neither wholely within the subject as a punctiform effect 
or response, nor wholely within the world as a cause or stimulus; rather, as an 
ecological concept; it stands like the mythical Colossus of Rhodes, astride the 
physical and psychological domains, one foot planted firmly on either shore. 
(More will be said about this in Part Two.) 

The ecological approach, unlike information-processing theories, denies also 
that nature, in any sense, communicates messages to us written in a kind of 
sensory shorthand which, to be comprehended, must be translated by a phalanx 
of cognitive homunculi into a more readable longhand for perusal by whom no 
one can say. Rather, it seems more parsimonious to assume that evolution has 
rendered such busy work unnecessary by designing perceptual systems, naturally 
selected over eons of practical use, to extract meaning directly from the structured 
energy propagations without benefit of any other stage of epistemic mediation. 

Hence, when asked where the buck of knowledge ceases to be passed, or where 
the epistemic regress ends, the ecological psychologist responds: At the begin- 
ning of the process; it stops with perception, the process upon which the 
intrinsic meaning of man’s relationship to his world is founded. It is the 
perceiver who knows and the knower who perceives, just as it is the world that is 
both perceived and known. Neither memory, nor inference, nor any other 
epistemic process than perception intervenes between the knowing-agent and the 
world it knows, for knowing is a direct rather than an indirect process. 


PART TWO: DIRECT VERSUS INDIRECT THEORIES OF KNOWING 


But do I hear a shout indirectly, when I hear the echo? If I touch you with a 
barge-pole, do I touch you indirectly? Or if you offer me a pig in a poke, 
might I feel the pig indirectly—through the poke? 


[J. L. Austin, 1962, p. 16] 
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It is sometimes argued that we do not perceive the world and its contents 
directly, rather we perceive something else in its place from which we then infer 
the world. Candidates for this “something else” have traditionally been such 
things as sense-data (for example, pointillistic patches of color, spots of texture, 
or margins of light contrast), or our own retinal images, or even states of our 
central nervous systems. Thus, from the standpoint of these hypotheses, we do 
not experience the world per se when engaged in the act conventionally known 
as “perceiving,” rather we infer the world’s objective existence from experience 
with the surrogate phantasms that it somehow gives rise to. 

The ecological approach disagrees with the hypotheses that perception is 
indirect whereas most information-processing theories agree with it. However, 
there are several points of agreement between the two approaches that must be 
discussed before their differences can be fully appreciate. These points of 
agreement revolve around classic philosophical positions regarding the relations 
of a knower to his world. 


Realism versus Nominalism and Phenomenalism 


Traditionally, the hypothesis of realism, in opposition to nominalism, asserts 
that abstract concepts or ideas may refer to properties, relations, objects, classes, 
etc., that have a real objective existence, that is, that their existence does not 
depend on their being perceived or thought about. 

By contrast nominalism asserts that abstract things, such as classes and relations, 
do not exist but may be denoted by words as useful fictions. From the stand- 
point of this doctrine such fictions must be constructed by the mind from the 
elementaristic aspects of objects as they might appear in isolation from other 
elementaristic aspects. (Later we will see how the doctrine of indirect realism, 
the version of realism championed by most information-processing approaches 
smacks of nominalism and how direct realism, the version of realism championed 
by the ecological approach, attempts to explain the direct apprehension of ab- 
stract entities, invariances, without recourse to cognitive construction.) 

For our purposes, the reading we would like to give to the hypothesis of 
realism is as follows: An external and objective world truly exists with real 
properties, abstract or otherwise, that may be experienced by virtue of percep- 
tion rather than experienced as fictitious constructions by virtue of intellective, 
mnemonic or imaginative processes. In short, it is the belief that perceptual 
knowledge of a world that exists apart from ourselves is truly possible. 

This view is also opposed to phenomenalism which asserts that all that truly 
exist are the phenomena we experience. Radical adherence to this view would 
deny what realists believe to be important distinctions between subjective and 
objective aspects of knowledge. To our knowledge, no living psychologist cur- 
tently embraces the doctrine of phenomenalism. Therefore, whatever distin- 
guishes the ecological approach from the information-processing approach, one 
can be sure that it is not the issue separating realism and phenomenalism. 
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What does distinguish the two approaches, however, is the brand of realism 
adhered to. Let us consider the various versions of the realist hypothesis which 
may or may not be relevant to distinguishing between the two approaches to 
psychological theory. 


The Varieties of Realism 


Direct versus indirect realisms. Theories of knowledge, and the perceptual 
theories entailed by them, come in two basic varieties: direct and indirect.’ 
Theorists who proceed on the assumption that we actually perceive the world, 
its properties, and contents directly instead of some surrogate for the world 
accept the hypothesis of direct realism. By contrast, realist theories that deny 
direct experience of the world but assume we experience the world secondhand, 
through the mediation of a representation or imagistic construct, accept the 
hypothesis of indirect realism. The difference between these two types of 
realism is not that one is perceptual while the other is nonperceptual, say, 
cognitive or intellectual; rather they differ with respect to what is believed to be 
perceived, the world or some world substitute. 

Put differently: The indirect realist rejects the idea that the world as such is 
the proximal cause of our perceptual experience, what we experience is believed 
to be one of the links in the cause-and-effect chain that binds the perceiver to 
the distal cause of his percept in the world. 

The sequence of causal links presumably consists of (a) the object or event, (b) 
the ambient energy modulated by it and propagated to the point of pickup in 
the receptor organ, (c) the receptor system itself including neural tracts and 
brain processes, and, finally, (d) the occurrence of the perceptual experience as 
part of the agent’s state of phenomenological awareness. 

Science has taught us that a break anywhere in this causal chain from object to 
percept prevents the phenomenological experience from occurring. For instance, 
without light no object is seen; without an intact eyeball, properly functioning 
optic nerve tract or brain processes, the perceiver is functionally blind. Thus, the 
items (a)-(d) are all a necessary part of the causal support of the percept. 

When fleshed out in appropriate detail, the above story is called the causal 
theory of perception (Russell, 1927), a story so widely accepted among contem- 
porary scientists that to deny it immediately places one’s credibility as a theorist 
in jeopardy. We do not mention this familiar story of how percepts are pre- 
sumably “caused” in order to contest it, but to remind ourselves and the reader 
that it is only a hypothesis. As the proverbial saying goes, sequential arguments 


"Later we will have reason to question seriously the usefulness of the indirect/direct 
realism distinction when applied to perception. For the time being, however, we will explore 


the distinction in a manner unprejudiced by this anticipation until it proves too problematic 
to continue to do so, 
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are only as strong as their weakest links. It is prudent, therefore, to recognize 
that the causal chain endorsed by this theory may have several weak links; 
moreover, which of the links presumed weakest is usually symptomatic of the 
version of realism held. Let us now explore each of the links for possible 
fractures. 


The mind-body problem. The perennial obstacle to a successful causal theory 
of perception is the so-called “mind-body” problem. Often it is argued that the 
linkage between world and percept is not properly homogeneous, since unlike 
the physical and biological links, the connection of final brain state to percept is 
problematic. How can a physical causal process give rise to a nonphysical 
(mental) experience? How can something so immaterial be the effect of a causal 
sequence that is quite material? Now is not the time, however, to attempt to 
resolve this problem, but it is worth noting that the problem of the relation of 
mind to body is not resolved by the causal theory of perception and, therefore, 
can play no role in its popular acceptance. 

A second weak link in the causal theory of perception supposed by many 
indirect realists is in the world-to-receptor hookup, a problem to which we now 
turn. 


Adumbrative Theories 


Some indirect theorists believe that the world is itself the weak link in the causal 
argument, that it fails to structure the energy distributions at the receptor organ 
in a manner sufficiently rich or precise to allow perception alone to be the 
source of our knowledge. Consequently, they argue that some constructive 
process must intervene to enrich, or adumbrate, the anemic perceptual informa- 
tion. The constructive process putatively required to enrich the available per- 
ceptual information, usually believed to be manifested in retinal images or brain 
states, is sometimes thought to be intellectual (inferential) (Helmholtz, 1925; 
Brunswik, 1956) or anamnesic (memorial) (Gregory, 1970) or both. Recently, in 
nonscientific circles, it has even become popular to speculate that the construc- 
tive process, in keeping with certain Eastern philosophies, may be imaginative 
(Pearce, 1971). 

The constructivist version of indirect perceptual theory, however, should not 
be confused with nonconstructivist versions of that theory which believe that 
although the retinal image or brain state is what is perceived, these constructs 
nevertheless provide unimpoverished, reasonable facsimiles of the world. In this 
case they are not viewed as intellectually enriched constructs but as high fidelity 
cross-sections of the flow of information from the world. More will be said 
about such nonconstructive versions of the indirect realists’ hypothesis later. 

Two versions of adumbrative or constructive theories deserve mention, those 
that partake of a priorism and those that partake of a weaker form of nativism. 
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The former is completely at odds with the ecological approach while some 
versions of the latter are much less so. 


A priorism and constructive theory. To adumbrate literally means “to give 
faint shadow or resemblance of; to outline or sketch; or to foreshadow or 
prefigure.” Consequently, the epithet adumbrative should be applied to all 
theories that claim information about the world must be restored to the retinal 
image, or cerebral states, in order to account for the observed richness of 
perceptual experiences and their close fit to things they represent. Merely, to call 
such theories “constructive” fails to capture the fact that the intervening 
construct is not just a haphazard or spontaneous arrangement of properties 
which, by sheer luck, happens to reflect the structure of the world in an 
appropriate fashion. Rather, the constructively augmented experience is pre- 
sumed to be constrained somehow to fit a faint memory of what is perceived, or 
is in some other remarkable way prefigured by inferential processes to cor- 
respond to the perceived object. 

The constructive theorist, therefore, to make this theory work, must explain 
how the constructive process becomes constrained by, and attuned to, physical 
reality in just such a way as to adumbrate, or sketch in, the ambiguous or 
missing portions of the retinal image, or cerebral state, so as to provide a true 
picture of reality. Without such an explanation, the constructive process be- 
comes utter imagination and, thereby, violates the basic tenet of realism which 
initiated its birth. 

Immanual Kant offered the most thorough attempt to face up to the problem 
of how raw experience (or what he called sensible intuition) is appropriately 
adumbrated by intellect in conformity with a real world that is itself inade- 
quately experienced. In Kant’s theory of knowledge, concepts such as number, 
substance and relation (the categories of pure reason) are assumed to apply to a 
world from which they were never learned. He called these synthetic a priori 
concepts. Their a priori character follows necessarily from the putative character 
of the world: although real rather than merely phenomenal, the world is also 
unknowable in itself (what is called noumenal). Consequently, the only positive 
fact about the properties of the real world that can be counted as necessary is 
that the a priori categories of mind do apply. Without this assumption, Kant 
argued, knowledge of the world would be impossible. On the truth of this 
fundamental presupposition relies the possibility of there being a principle to 
guide the adumbrative powers of mind. 

Although many contemporary constructive realists (for example, Piaget, 
Pribram, Fodor, and most information-processing theorists) might balk at certain 
specifics contained in Kant’s proposal, they must nevertheless accept, as a logical 
primitive of their own position as indirect realists, something comparable to 
Kant’s synthetic a priori. 

The cognitive capacity to adumbrate perceptual experience properly so as to 
allow one to parlay meager perceptual wagers into true knowledge of the world, 
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not only must be present at birth, since such a power cannot be perceptually 
learned, but must also be present from the dawn of human intelligence. 

Unlike Kant, current indirect theorists, who believe that adumbration of 
perceptual experience is required, have Darwin to fall back on to explain the 
emergence of an externally controlled, constructive capacity of mind. Thus, we 
turn now to the evolutionist’s alternative to a priori realism. 


Evolutionism. We now see that a bone of contention between indirect realists 
of the constructive variety and direct realists is captured in the phrase, per- 
ceptual knowledge is impossible without proper adumbration by cognition. This 
is a recalcitrant difference when the adumbration process is assumed to be truly 
a priori; however, when other interpretations of the above thesis are invoked the 
waters become muddied again. 

It is a mistake to see the difference between constructive and direct realism as 
merely another instance of the classic confrontation between nativism and 
empiricism. This issue is more complicated today than in earlier centuries before 
Darwin, given the restricted forms of nativism offered by evolutionary princi- 
ples. Ideas believed innate to an individual need not be believed innate to a 
species. Thus the issue as to what constitutes a priori knowledge is no longer as 
clear-cut as before, for what is a priori to an individual member of a species may 
not be a priori for the species. In this way evolutionism has taken the sting out 
of nativism; one need not be a radical rationalist in order to be a nativist of some 
sort. 

Consequently, the implied nativism of constructive theorists may even find its 
way, in a more temperate form to be sure, into the arsenal of ideas of the 
ecological theorist. Direct realists of this persuasion (for example, Gibson, Mace, 
Turvey, the authors of this chapter) do not forsake the proposition that the 
perceptual systems not only become tuned to the world by experience, but are 
genetically preattuned by the evolutionary experience of the species. Indeed, 
given current evidence, it would be foolish to believe otherwise (see Gibson, this 
volume; Shaw & McIntyre, 1974). 

Similarly, given the empiricist flavor of behaviorist learning theories of the 
1950s that has leached over into cognitive theories of the 1960s and 1970s, most 
information-processing theorists would feel uncomfortable being classified with 
radical nativists who avow a priorism. Moreover, it is not totally inaccurate to 
classify many contemporary psychologist who give lip service to the cognitive 
approach as cryptoneoempiricists. Neither, however, should we overlook the 
contemporary cognitive theorists (for example, Chomsky, Piaget, Pribram) who 
might reject the label of a priorist if this were interpreted as cynicism regarding a 
programmatic solution to the problem of innate ideas promised by evolutionists. 

The upshot of the above discussion is that both ecological psychologists and 
information-processing ones, as proponents of direct and indirect realism, re- 
spectively, emphatically disavow the thesis of radical empiricism—namely, that 
the mind of man at birth is a blank tablet such that all psychological abilities are 
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traceable to the individual’s particular learning experiences. Instead the issue 
separating the two approaches reduces to the interpretation of how evolutionary 
attunement plays a role in the adumbration of perceptual experiences—that it 
does somehow and to some extent is not in question. 

For the ecological theorist, evolution plays an indispensable role in perception 
by designing the perceptual systems of individuals to be especially attuned to the 
most significant dimensions (not specific contents!) of information required for 
adaptive functioning within the ecological niche provided their species by 
nature. Perception, however, is no less direct for having been genetically preat- 
tuned (an adumbrative process), for what is perceptually experienced is solely 
the information presented to the senses by the world. All that genetic preattune- 
ment accomplishes is greater facility for the perceptual systems differentiating 
the information made available by the environment so as to extract perceptual 
invariants that embody the ecologically significant properties of the perceiver’s 
world. Knowing the world truly, if not completely, is inevitable given the direct 
nature of perception and the sufficiency of the information made available to 
both species and its individual members. 

Therefore, no enrichment of perceptual information by nonperceptual pro- 
cesses is needed, since the phenomenological experience that results from the 
proper pickup of that information provides a sufficient basis for knowing how to 
survive, maintain health, and achieve reasonable success in other endeavors. 


The Possibility of Error 


Under normal circumstances our perceptual systems are quite capable of picking 
up information that provides true specification of most significant properties of 
the world. On occasion, however, when certain necessary conditions on observa- 
tion are not met, the available information may be deficient or perceptual 
sampling so restricted that errors in judgment ensue. For instance, if the 
biological support for perception is withdrawn, say, by some physical or chemi- 
cal trauma to the central nervous system, proper modulation of the available 
information may become impossible. These insults to the perceptual systems 
may lead to distortions of the information picked up or even to hallucinatory 
experiences. Of course the possibility of detecting such problems in perception 
logically presupposes the ultimate availability of sufficient information to 
specify what is true. 

Another source of faulty judgment about the nature of what is perceived might 
arise from deficient information—that is, from the physical support of percep- 
tion being withdrawn, say, because of contingent restrictions on perceptual 
sampling. A person who is unable to see an object from the proper distance, 
from the proper perspective, or under proper illumination, may judge the nature 
of his experience incorrectly because he elects to go beyond the information 


1, INTRODUCTION 17 


given. Again it follows from the fact that such situations can ultimately be 
corrected, on the basis of further perceptual sampling, that true perceptual 
experience is possible. 

The fundamental assumption of the ecological approach is simply that con- 
tingently deficient perceptual information may be rendered sufficient for correct 
judgment by allowing the perceiving agent more time to sample and/or more 
space in which to maneuver (Gibson, 1966). For instance, one may believe the 
facade on a movie set to be a real castle—an error in judgment, until one takes 
the time to notice its papier maché quality or to walk around behind it. Gibson 
(1966) defines perception as the pickup of information over time—a sufficient 
amount of time, that is, for extracting the significant information available over 
space to correctly judge the nature of ecologically relevant phenomena. Thus, 
the point to be emphasized is that cognitive judgments are parasitic on percep- 
tual experiences; they are logically posterior to perception rather than either 
anterior to it or part of it. 

Furthermore, the ecological position assumes that the majority of judgments 
relevant to the survival, health and welfare of a person can, in principle, be 
validated or invalidated on the basis of perceptual evidence alone, Hence 
adumbration of perceptual experiences by nonperceptual processes is rendered 
unnecessary by evolution since by natural selection the perceptual systems may 
become optimally attuned by their biological design to pick up certain forms of 
information and not others. For instance, the frog will perceive a small object in 
motion (for example, a flying insect) and flick it from the air with its sticky 
tongue while ignoring the same small object when static in its field of vision. 
Thus the specialization of the sense, through evolution, to pick up ecologically 
relevant information in an optimal fashion, constitutes a direct adumbrative 
process, a process that guides perceptual differentiation through the welter of 
available information without recourse to indirect adumbrative processes of a 
nonperceptual origin. To assume otherwise that evolution attuned the perceptual 
systems to function erroneously, to pick up misleading information, is to assume 
that evolution is a fundamentally maladaptive process—an assumption that 
undercuts its theoretical value as a scientific hypothesis. Consequently, one must 
be careful not to confuse the sometimes unreliable functioning of the perceptual 
systems as physiological processes with evidence that they are always funda- 
mentally in error. Error, like success, is a relative term to be measured against 
the degree of adaptive behavior exhibited by people and animals day by day in 
coping with their worlds—an accomplishment that is by no means perfect, but 
far from being random or negatively correlated with success as the sheer number 
of people populating our planet now attests. 

Thus, it would be a fundamental mistake to argue that because the necessary 
conditions for perception can on occasion be violated, then perception is an 
unreliable source of knowledge about the world. The laws governing perception 
are no different from other natural laws in this respect—true knowledge about 
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the world is only yielded if certain conditions on their application are observed, 
what is sometimes called their initial and boundary conditions. 

Thus, the fundamental claim of the ecological approach, as a version of direct 
realism, is in no way infirmed by the admission of the possibility of error in 
perceptual judgment, since this in no way gives carte blanche to the charge that 
perception is unlawful and, hence, incapable of yielding reliable knowledge 
about the world. Rather it is only to admit what must be true of any lawful 
process, namely, that whenever the ceteris paribus clause is satisfied and the 
senses function reliably, then perceptual information specifies true propositions 
about the world. 


Naive versus critical realism. Sometimes it is claimed, and quite unfairly, that 
direct realism necessarily entails naive realism (Maxwell, 1970, pp. 3-34). Naive 
realism is the view that what is known about the world is both unadulterated 
and unexpurgated with respect to even its most subtle details, as such, the view 
roundly denies Kant’s pessimistic claim that the world is noumenal or essentially 
unknowable in itself. 

Perceptual theories based on naive realism are the most optimistic of all 
theories, since they claim that the information needed to specify all the proper- 
ties of the world is readily available to the senses. Hence the world is exactly as 
it seems (Russell, 1940, p. 14). 

For all practical purposes naive realism becomes indistinguishable from phe- 
nomenalism, since whatever is deemed known is also deemed perceivable; thus 
perceptual phenomena comprise all that is real. But direct realism purports to 
reject phenomenalism just as assuredly as does indirect realism; must we con- 
clude then that direct realists are deluded when believing themselves to be 
realists, since in fact, they are de facto nominalists? Or, perhaps, the claim that 
direct realism entails naive realism may well be false. Indeed this criticism can be 
shown to be false by carefully considering several dimensions of knowledge on 
which various types of realisms may be compared. 

Consider a scale on which we rate the degree of completeness of our knowl- 
edge of reality. Imagine, say, an eleven point scale ranging from 0, or no 
knowledge, on one end to 10, or total knowledge, on the other. The most 
optimistic of all theorists, the naive realist, would provide an anchor point on 
the upper end of the scale, with a score of 10 points, indicating a belief that our 
knowledge of the world is perfect. Holding down the anchor point on the lower 
end of the scale, and receiving a score of 0, would be the most pessimistic of all 
theorists, the cynical realist. Such a theorist, as a realist, still believes in the 
existence of an external world but denies that we can know anything whatever 
about it. 

On this end of the scale, for instance, we might place Kant in his most 
pessimistic moments, say, before he became convinced that the categories of 
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pure reason necessarily apply to the noumenal world. However, immediately 
after formulating the argument that we must be capable of knowing at least one 
fact about the world, namely, that the synthetic a priori categories apply, Kant 
might deserve a small positive score on the scale, say, a 1.0. 

Where should we place those believers in direct and indirect realism so as to 
express their opposition to one another? If this scale of completeness of 
knowledge is truly the dimension on which the two versions of realism may be 
compared, then they should occupy polar positions on the scale. Furthermore, if 
direct realism necessarily entails naive realism then the two should fall together 
on the upper end of the scale. On the other hand, if neither of the above claims 
holds, then the positions on the scale held by direct and indirect realists should 
prove uncorrelated. 

It only takes a moment’s consideration to see that this latter statement holds, 
for neither direct or indirect realists believe knowledge of reality to be total. The 
issue that separates the two is not how much can be known but by what means 
whatever is known is known. Hence, in sharp disagreement with naive realists, 
direct realists no less than indirect realists may agree with Kant that we know 
little about the world itself. In this case, the proper qualifier for the realism 
adhered to would be critical.” 

Thus, it is quite possible for a person to be either a direct realist, or indirect 
realist, while at the same time being a-critical realist-one who believes our 
knowledge of the world to be something less than perfect. Unlike the naive 
realist, who believes things appear exactly as they are, the critical realist realizes 
that one’s judgment may be led astray if the information sampled is accidentally 
impoverished, or if the perceiver’s ability to sample the information is artificially 
restricted. (Recall the earlier example of the paper maché castle; also see 
Gibson’s 1966 theory of illusions.) 

The characterization of Gibson’s form of realism is clearly critical, as Mary 
Henle (1974) correctly observes (although she seems to see an inconsistency in 


* Historically speaking, critical realism is the name primarily attributed to two groups of 
American philosophers which include such thinkers as Durant Drake, A. K. Rogers, George 
Santayana, and C. A. Strong on the one hand, and A. O. Lovejoy, J. B. Pratt, and R. W. 
Sellars on the other. Both groups agreed that what we are directly aware of in perception are 
not actually properties of external objects but are “character-complexes . . . irresistibly 
taken, in the moment of perception to be the characters of existing outer objects [Drake, 
1920, p. 20].” The usage, however, that we are giving to the term is broader than their more 
narrow connotation. In contrast to their view which was dualistic, we do not mean to imply 
either dualism or monism. Rather what we see as the issue among critical realists of various 
persuasions (for example, direct versus indirect realists) devolves upon the theoretical 
interpretation of “character-complexes” and their relation to the objects they specify (that 
is, is the relation direct or indirect). We can see no purpose served by arbitrarily restricting 
the doctrine of critical realism to the narrower interpretation that accident gives historical 
precedent. 
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one professing both a direct and critical realism, presumably because she mis- 
takenly equates direct realism with naive realism).” Thus, on the scale denoting 
completeness of knowledge, critical realists would populate all those regions 
lying between naive realism and cynical realism, with their exact position being 
determined by the degree of pessimism or optimism held. 

But notice, the issue between naive realists and critical realists, in general, is 
not one of the veridicality of perceptual knowledge but of its completeness 
(although one might wish to puzzle over the ways incomplete knowledge may 
lead one to false judgments). 

The obvious scale on which direct and indirect theories of perception may be 
contrasted is degree of proximity of the object of perceptual experience to the 
information source in the world which determines it. In other words, assuming 
the causal theory of perception, one might rate the distance separating the 
proximal cause of the percept from its distal material cause in the world. 
Consequently, by definition, direct realists would receive a minimal score of 0, 
since they believe the proximal and distal cause of perception to be the same. 

Indirect realists, on the other hand, receive positive distance scores, depending 
on the link in the causal chain believed to be the proximal cause of the percept 
(for example, retinal image or brain state). By contrast, naive realists could be 
assigned any score whatever, since any link in the causal chain, from world, 
energy propagation, receptor state, to brain state, may be presumed to provide a 
complete specification of the world as it stands, indirect though that specifica- 
tion may be. All that is required for naive realism to hold is that the causal stage 
experienced be a source of complete knowledge of the world. 

If the stage experienced is not the world itself, then that stage provides 
epistemic mediation for the person’s knowledge of the world. It is important to 
realize that whatever link in the causal chain:is presumed to be the proximal 
cause of perceptual experience, that link is directly perceived. To assume 
otherwise would lead to a regress of endlessly interpolating epistemic mediators 
between the knowing-agent and the object known with nothing ever qualifying 
as a true proximal cause of perceptual experience. 

Naive realism requires only that the epistemic mediator be perfect, but in no 
way mandates the link to be experienced. The naive realist who believes the 
world itself to be the object of experience is also a direct realist. But this is not 
the general case, for other naive realists may believe some other stage in the 
causal sequence to effect epistemic mediation with the world; thus they would 
be indirect realists. Any view, direct or otherwise, which believes epistemic 
mediation to be perfect is by that belief a naive realism. Therefore, the claim 
that direct realism necessarily entails naive realism seems quite certainly false. 


*Or, perhaps, she believes that critical realism entails dualism while direct realism does 
not. This would be in keeping with the traditional interpretation of critical realism as 
discussed in the preceding footnote. 
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The goal of this section was to show that although direct and indirect theories 
of perception partake of different forms of realism, neither can be rejected out 
of hand for being unscientific, that is, for being obviously naive, uncritical or in 
violation of the basic precept of causal explanation, a precept believed by many 
to be essential to any completely adequate scientific explanation. However the 
two types of theories are to be distinguished, the difference must lie elsewhere. 

To discover the basis of their distinction, it would be wise to look more 
carefully at the concept of epistemic mediation. If this concept is found to apply 
validly to perception, then a case for indirect perception might be made (as 
many theorists already believe). On the other hand, if this concept proves to be 
theoretically vacuous, to serve no real explanatory purpose, then a case in 
support of direct perception will be made (as a few of us believe can be). 

In Part Three, we question whether there is any logical necessity at all for 
believing perception to be indirect, indeed whether the term itself has any clear 
meaning when applied to perception. 


PART THREE: THE “DIRECT” PERCEPTION OF 
INVARIANT INFORMATION 


Philosophers, it is said, “are not, for the most part, prepared to admit that 
such objects as pens or cigarettes are ever directly perceived.” Now of course 
what brings us up short here is the word “directly”—a great favorite among 
philosophers, but actually one of the less conspicuous snakes in the linguistic 
grass . . . it is essential to realize that here the notion of perceiving indirectly 
wears the trousers—‘‘directly” takes whatever sense it has from the contrast 
with its opposite. .. . 


[J. L. Austin, 1962, p. 15, commenting 
on A. J. Ayer, 1940, pp. 1-2]. 


Taking Stock 


Before returning to the crux of the issue separating direct and indirect realists, 
let us take stock of those issues that do not necessarily divide them. The above 
discussion, if correct, suggests a broader base of common agreement than might 
first have been apparent. 


1. The rejection of phenomenalism. It goes without saying that, as realisms, 
both views by definition oppose phenomenalism in their common belief that 
there is more to the world than perceptual phenomena. Moreover, they concur 
with the popular viewpoint of contemporary physics, namely, that some proper- 
ties and entities may be unobservables and yet be known through rational 
inference. For instance the track in a hydrogen bubble chamber or Wilson cloud 
chamber may provide perceptual evidence for the postulation of inferred entities 
given one’s acceptance of certain physical laws and theoretical assumptions 
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provided by quantum mechanics. Indeed, both views may even agree that there 
exist some types of subatomic particles which are in principle not visible (for 
example, quarks). 

This concession in no way jeopardizes the consistency of the direct realists’ 
belief that perception, one mode of knowing, is direct, while inference, another 
mode of knowing, is indirect; for inferring is not perceiving and it would be a 
grievous mistake to treat them equivalently. In perceiving, according to direct 
realism, the object known is experienced, while in inferring, the object known is 
not experienced. (For instance, one perceives the bubble track of a particle but 
infers the particle). Hence perception and inference are not, epistemically 
speaking, processes of the same type. This distinction allows for agreement on a 
second proposition. 

2. The possibility of error in knowledge. Since inference and perception are 
epistemically distinct one might lead to error while the other might not. Both 
views wish to account for error, since neither believes knowledge to be perfect. 
The indirect realist is willing to entertain the possibility that our senses may 
somehow “lie” to us—but this is surely metaphorical, for our senses are dumb, 
they do not fell us anything. It is the person, not the senses, who judges by 
inferring conclusions, just or no, from perceptual experiences. Thus, for the 
direct realist, perception merely is what it is, it is in principle neither true nor 
false; rather, it is the judgment inferred from premises believed applicable to the 
percept that may sometimes go astray. To think otherwise is to commit what 
Ryle (1949) called a category mistake, mistaking an entity of one type with 
that of another, such as confusing a word with the object it takes. 

3. Evolutionism. It is unfair for either view to suggest that the other neces- 
sarily takes a radical position with respect to either rationalism or empiricism. 
Given the option of evolutionism as a reasonable compromise between the 
extreme positions of radical nativism (a priorism) or radical empiricism (behav- 
iorism), either theorist is free to accept the scientifically popular, neo-Darwinian 
compromise without fear of fundamental inconsistency. The direct realist is 
perfectly willing to accept the possibility that the perceptual systems have been 
genetically preattuned to differentiate environmental information in a selective 
manner. Thus, the adumbrative process need not have a nonperceptual origin. If 
one assumes it does, as some constructuve theorists do, then there is truly the 
danger of sliding into a priorism. 

4. The acceptance of critical realism. One of the least thoughtful, and yet 
most persistent complaints levied by indirect realists at direct realism is the claim 
that it either (a) necessarily entails naive realism or (b) is identical with naive 
realism. Both claims can be countered by showing that direct realism, no less 
than indirect realism, may assume knowledge of the world is less than perfect, 
that given the information available over some restricted temporal or spatial 
period, how the world seems may not be how it truly is. Thus, when rating the 
two views on a scale of completeness of knowledge about the world we find that 
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either direct or indirect realism may vary from naive realism (a score of perfect) 
to cynical realism (a score of zero). 

5. The ultimate criticism. There is one final charge that might be raised 
against direct realism and the theory of direct perception entailed by such. 
Admitting that the theory of direct perception is no less scientific than theories 
of indirect perception, one might conclude that for all practical purposes the 
two types of theories are the same, one serving no better as a guide to research 
than the other. Perhaps, one might even go so far as to argue that, to the extent 
that the theory of direct perception is concocted to be reasonable, it conflates 
onto some version of the theory of indirect perception. This criticism is not so 
farfetched as might first be thought. 

Unfortunately, it is indeed sometimes the case that sympathetic interpreters of 
Gibson’s theory of direct perception equivocate when asked if one perceives the 
world or just the information made available by the world. For instance, the 
critic might point out that Gibson himself defines perception as the pickup of 
information over time. Therefore, if the world makes the information available, 
does not information stand as an epistemic mediator between the perceiver and 
its world? Hence, we should legitimately conclude then that, by positing in- 
formation as a mediating construct, Gibson’s own theory of perception can 
properly be dubbed “indirect.” 

If we assume, however, that the causal theory of perception holds, the critic 
might concede, for sake of detente, that Gibson’s theory may truly be con- 
sidered Jess indirect than other theories because it assumes the proximal cause of 
perception to be a more distal link in the causal chain than that assumed by 
other indirect theories—say, by retinal image or brain state theories. In other 
words, Gibson’s theory assumes the perceptual experience to be of information 
carried by energy distributions in the environment lying outside the perceiver’s 
body rather than being an experience of retinal images or brain states lying quite 
inside the perceiver’s body. 

Although this interpretation allows Gibson’s theory of perception to be less 
indirect than others, it would nevertheless be indirect—differing from retinal 
image theory or brain state theory only quantitatively rather than qualitatively. 
From here the issue between the ecological approach and other types of 
information-processing theories becomes, as Pribram (this volume) suggests, 
merely one of determining experimentally whether perception is the detection 
of information instantiated in the ambient energy arrays of the physical environ- 
ment or the detection of information instantiated in physiological constructs, 
say, in the dynamic states of receptor or cerebral processes. 

As fair as this compromise may seem, it is too slick, for it shoots right past the 
real issue, namely, the claim by direct realism that knowledge of the world is 
possible only because we do perceive the world and not merely some phantom 
surrogate of the world. Moreover, what is at stake between theories of direct and 
indirect perception is the practical issue of where our field should focus its 
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resources in an attempt to characterize what is most directly responsible for 
perceptual knowledge. 

Should our research efforts be directed at experimental manipulation of retinal 
or cerebral variables and the development of corresponding dependent measures, 
or should our research efforts attempt to “ecologize” physics in the hope of 
discovering the direct source of perceptual knowledge and corresponding ways 
of charactering it? Thus, the resolution of the issue of whether perception is 
direct or indirect is not merely of academic concern but has practical implica- 
tions for research and how researchers should be trained. 

Given the above concerns, it becomes crucial that we resolve this last issue of 
whether direct perception can be treated de facto as a disguised form of indirect 
perception. We turn now to a resolution of this problem. 


The Meaning of Being Direct 


To fully appreciate why the above claim is invalid—why direct perception can in 
no sense be reduced to a version of indirect perception—it is helpful to consider 
the meaning of the two terms. Austin (1962) makes several interesting observa- 
tions regarding their relationship as polar concepts. 

In the passage quoted at the beginning of Part Three, Austin suggests that the 
term “direct” is “one of the less conspicuous snakes in the linguistic grass”; its 
sense may remain obscure until placed in sharp contrast to some precise usage of 
the term “indirect.” As he so elegantly puts it: It is the latter term which clearly 
“wears the trousers”—that is, which takes priority in setting usage in linguistic 
contexts. To see why this is so, let us consider an analogous case where the 
meaning of one member of a contrastive pair of terms, the dominant one, 
determines which of several possible connotations the second member is to take. 

To understand what “colored” means we must first understand what “un- 
colored” means. Since the normal circumstance for humans is to be able to 
perceive the multicolored facets of the world, to draw our attention to this 
property dimension, it clearly helps to contrast it with its opposite—“un- 
colored.” However, the meaning of the latter term is ambiguous. One sense of 
being uncolored is to be colorless such as pure air, water, or glass. Here 
“uncolored” means colorless or transparent. But another sense of “uncolored” 
means to be without hue, as in the case of neutral gray pigments or white 
pigments, where there is a careful balance of all hues. Thus, if someone were to 
say “I saw a colored object today,” we would not know immediately what was 
meant. The sense that “colored” takes depends by contrast upon which sense of 
“uncolored” was intended. 

For instance, if the linguistic contrast intended was between “‘colored” and 
“uncolored,” in the sense of colorless or transparent, then “colored” would be 
synonymous with “opaque.” On the other hand, if the intended linguistic 
contrast is between “colored” and “uncolored,” in the sense of neutral gray, 
then “colored” would be synonymous with “tinted” (by some hue). Here we 
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find that the sense of “colored” is less conspicuous than that of “uncolored”; 
hence, it is the latter term which (like “indirect””) “wears the trousers” in the 
contrastive relationship and, therefore, whose sense must be clarified first. 
(Terms such as “uncolored” are sometimes said to be semantically marked.) 
Consequently, we should not be surprised that the technical use of the terms 
“direct” and “indirect” can be equally problematic. 

An important suggestion, however, emerges from the above argument that one 
member of a linguistic contrast may be dominant in setting the sense of the 
other (that is, semantically marked); it is in theories which adopt the term 
“indirect” as a key concept that technical language is most likely to go astray. 
Thus, contrary to what is sometimes claimed, the burden of clarification na- 
turally falls on indirect realists rather than on direct realists since it is their half 
of the contrast that is semantically marked. 

To see why this is so, contrast the claim that one can perceive objects 
indirectly with the claim that one can see unreal objects. Austin (1962) suggests 
the following scenario for a conversation revolving around the latter claim: “I 
can see an unreal duck”—“What on earth do you mean?’’—“It’s a decoy duck”— 
“Ah, I see! Why didn’t you say so at once? [p. 18].” Here we understand 
specifically what “unreal” means (that is, artificial) so that if someone then 
remarked, “Oh, but next to the decoy is a real duck,” we would then know 
exactly the sense intended (that is, a living, flesh and blood duck). 

In a similar vein, to grasp what is intended by the assertion that we perceive 
the world directly requires that we first determine what it would mean to 
perceive the world indirectly. However, if the latter semantically marked term 
turns out to have no useful interpretation, then the former contrastive but 
unmarked term becomes vestigial and can be dropped from usage. In such a case, 
the assertion that we perceive the world directly could then be shortened, 
without loss of meaning, back to its original, commonsense form of simply we 
perceive the world—something few people would question until tutored in 
certain brands of philosophy. 

If the above argument is correct, and “direct” is the semantically unmarked 
term in the linguistic contrast, then this may account for the difficulty en- 
countered by so-called direct realists (for example, J. J. Gibson) in making their 
positions clear, when asked to do so outside of the countercontext provided by a 
well articulated indirect theory. Hence the obstacles which block the path to a 
clear understanding of how perception may in principle be indirect, a fortiori 
block the path to a clear understanding of how perception may in principle be 
direct. Let us consider some of these obstacles. 


The Meaning of Perception Being Indirect 


The chief obstacle to the acceptance of theories that knowledge is indirect is the 
simple fact that people are very good at noticing the world and its properties but 
terribly inept at noticing the properties of constructs hypothesized to mediate 
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perception. When we dodge a baseball, we do not seem to notice that the retinal 
image projected from the oncoming missile through the lens of the eye must be 
upside down and backward. People and cars down the block do not appear 
smallish or to change size upon approach and recession as their retinal images 
must; rather they are experienced as possessing the stature and substance we 
know them to have. Similarly, when biting into an apple, we experience the 
juicy, red object as existing in our hands and not in our brains or eyeballs. 

Thus, it is indirect theorists rather than direct theorists who must defend their 
theoretical circumlocutions since they have been concocted to disallow our 
speaking of the objects of perception as being what they seem. This is only fair, 
for whenever a theory goes against the natural grain of experiences common to 
all people, the burden of motivating unnatural expressions or surprising conclu- 
sions must fall to the offending party. Again, to follow Austin’s line of illustra- 
tion for this point consider the following: 

If someone reports to you, during a time of war, that a sighting of enemy ships 
is indirect, then it should merely provoke the question of what exactly is meant. 
If, however, that person goes on to explain that what actually was seen were 
blips on a radar screen and not the ships themselves, then the natural reply 
should be, “Well, why didn’t you say so in the first place!” 

Clearly, in the light of what might be meant by the claim that a sighting of 
enemy ships is indirect, (that is, their blips were seen on radar), it is now possible 
to interpret a counterclaim that a sighting of enemy ships might thereby be 
direct, say, as a sighting by telescope from the crows nest of a ship. However, the 
more radical claim by an indirect realist that even a sighting of enemy ships 
through a telescope is also really indirect suggests that only a sighting by the 
naked eye would constitute a direct perception of the enemy ships. But the 
recalcitrant indirect realist must not let go of his argument at this point, but 
must push on with the surprising claim that even a sighting of enemy ships by 
the naked eye is actually indirect—in some extraordinary sense to be sure. 

What ultimate meaning for the term “direct” does this latter claim imply—that 
direct perception of something must be accomplished without the lens of the 
eye since it too casts images to the eyeless homunculus? And since we might 
consider the cortical projections from the eye to be images, (at least in the 
mathematical sense), must a direct perceiver be excised of his visual cortex as 
well? 

This interpretation of “indirect” perception thus leads to an absurd interpreta- 
tion of “direct” perception. Moreover, such a radical interpretation of indirect 
becomes hoisted on its own petard, for if no clear meaning can be attributed to 
one half of a linguistic contrast, none can be attributed to the other either. 
Again we find that it is the indirect theorist who ultimately must explain his 
concept first before the direct theorist can know what to say in response. 

All that the direct perceptual theorist can offer is the boring repetition of his 
first premise; namely, that the perception of X is indeed what it seems—the 
perception of X rather than Y. If all the indirect theorist can do is persist in 
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inexplicably claiming that the direct theorist’s claim is absurd on prima facie 
grounds—that the perception of X is really the perception of Y—an impasse to 
further reasonable argument must be recognized. 

Or, perhaps, the linguistic problem of attributing a clear meaning to “in- 
direct,” since it is the dominant or semantically marked half of the indirect/ 
direct perception contrast, is sympotomatic of a deeper philosophical confusion. 
To dramatize this possible confusion and to illustrate some of its practical 
implications for perceptual theory, let us again paint a scenario of a possible 
confrontation between the two opposing theorists. Bear in mind, also, that the 
core of the disagreement may be as much one of attitude as it is of substance. 

If you were to tell a friend that he did not really see his wife when looking 
straight into her uncovered face while awake, sober, and attentive, then he 
would be quite astounded. His next question should be to ask if you had played 
a trick on him by substituting in the place of his wife someone who looked 
remarkably like her, say an actress with clever makeup or an unsuspected 
identical twin sister. Furthermore, on denial by you that either was the case, he 
would be quite perplexed regarding what you meant. 

You might explain that what you meant to say was that he did not see her 
directly as supposed but indirectly. However, on touching her warm, animate, 
three-dimensional body instead of the coolor, flat surface of a mirror, your 
friend would feel quite justified in vociferously denying your claim. 

You may then continue, if your patience persists, to explain that you did not 
mean to claim that he saw her image reflected in a mirror, but her image as 
reflected in the eye. At this your friend should reply that he never thought 
otherwise, but express perplexity that you should call this an indirect view of his 
wife, 

Perhaps, with a creeping feeling of futility, you might try one final sally to put 
over the subtlety of your point! “Look, my good man, what I really mean to say 
is that all you, or anyone else for that matter, have ever seen or will ever see—all, 
mind you, are but little phantasms of the world in the eye—flattish, smallish, 
images of objects on the back of the eyeball!”—“Oh, God!” your friend might 
then gasp in complete befuddlement, unable to continue. 

Most likely the argument would end here with you thinking your friend quite 
dense and possessing no head for science, and with him quite convinced that you 
are more than just a little mad. In a less jocular vein, however, it should be 
pointed out that the above dialogue dramatizes the extreme difficulty direct and 
indirect realists have traditionally faced when trying to convey their respective 
positions to one another. The difference is indeed as much one of attitude as of 
substance. We should, therefore, take great care to consider fairly the merits of 
each position and avoid, so far as possible, engaging in the easy polemics that 
arise from either attitude. It helps to remember that neither view, as argued 
before, is inherently unscientific, although one may be better science than the 
other. 

The chief obstacle to clarifying the indirect view, we now see, devolves upon 
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the simple fact that people are very good at noticing the world but terribly inept 
at noticing the properties of the mediating constructs postulated as way stations 
for perception. Indeed, only after years of practice do artists even approach the 
skill of attending to objects by reduced vision—the ability to disregard the 
effects of the constancies of size, shape, color, and brightness (Arnheim, 1964; 
Hester, this volume.) 

Furthermore, the fact that people can develop the ability to attend to the 
properties of the retinal image provides a source of evidence that runs counter to 
the indirect realist’s hypothesis, since the perceived properties of the retinal 
image are not at all like the properties of the real objects. The former exhibits 
none of the constancies that characterize the latter and hence are readily 
distinguishable from them. How then can retinal images be the basis of our 
perceptual experience of objects? Perhaps, a better guess is that retinal images 
(or other sensory icons) are but epiphenomenal spill-over from the perceptual 
process by which we experience the world. 

In the final section of this chapter, we explore the suggestion that, when 
properly construed, direct realism provides an acceptable scientific framework 
within which theories of perception might be constructed. However, our tactic 
will be to question the fruitfulness of the direct/indirect dichotomy for percep- 
tion, not because no clear theory of direct perception is possible, but because no 
valid theory of indirect perception is available with which it might be contrasted. 
Without such a contrast, the epithet “direct” serves no real purpose. 

Of course, we do not mean to deny the obvious fact that traditionally a 
contrast between the two types of theories has been verbally drawn; rather we 
will try to show that such a contrast is, at best, trivially true where the term 
“indirect” may be validly applied to perception, and, at worst, misleading when 
the term “indirect” is misapplied to perception. 


Perceiving is not inferring. The key question which motivates the theoretical 
position that perception is indirect might be phrased as follows: If all we are 
capable of experiencing are our retinal images or brain states, as often claimed, 
then why should we ever suppose that our experience even indirectly refers to a 
real world? If all that we experience is “within” our own heads, then how did we 
ever come to the realist’s conclusion that anything exists completely and 
independently outside ourselves? In other words, how can indirect realism avoid 
falling into phenomenalism? 

The indirect realist typically answers this charge by claiming that we infer that 
some of the properties of our experience are true of the world. But, unfortu- 
nately for this view, we are not aware that we are merely inferring the fit of such 
properties to the world; rather we feel an irresistible compulsion to attribute 
such properties to the world while in the act of experiencing them. At least such 
a one-step inferential process, if that is what it is, seems to be direct. 

To claim, however, that our experience of the world is really an experience of 
such a direct inference, is merely to regress the original question rather than to 
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answer it; for now we must ask why an experience of an inference should ever be 
taken to be a perceptual experience of the real properties of the world—the very 
question with which we began. 

Furthermore, it will not do simply to assume that inferences somehow may, 
willynilly, be taken for perceptions. Rather, how such a mistake can occur 
requires a cogent theory to explain the nature of the confusion—a theory which 
must be supplied by its claimant, the indirect theorist, and not by the direct 
theorist. Consider an absurd implication of such a claim: When we infer that 
“Socrates is mortal” is a true statement from the proverbial syllogism, in no 
sense are we compelled to believe that the man Socrates, who we are told lived 
more than 2000 years ago, is part of our perceptual experience of the current 
world. Indeed, if we should experience whatever we infer, then it would be more 
accurate to call such experiences hallucinations rather than perceptions. 

Moreover, by this line of reasoning we might suppose that logicians and 
lawyers, who presumably do more inferential work than most of us, should show 
signs of hallucinating much of the time, while schizophrenics, who presumably 
do not reason well at all, should yield the least evidence of hallucinating, with 
ordinary people falling somewhere in between. Thus, the claim that inferential 
experience can be phenomenologically as rich as perception, must be false, or at 
best only metaphorically true. Unfortunately, such a metaphor seems difficult to 
comprehend. Perception is not inferential and no verbal sleight of hand can 
equate the two. 


Perception is not memory. An equally silly claim is that perceptual experi- 
ence is somehow based on memory. If we take the claim literally, then we must 
ask why we do not hallucinate whatever we remember, say our deceased 
grandmothers, when recalling them to mind. The British Empiricists’ claim that 
memories may be distinguished from percepts because they are less “vivid” does 
not really work at all, for it leaves unanswered the question of how things are 
perceived in the first place. When experiencing novel objects or events, since 
there are no memories to draw upon, memory can play no role at all in primary 


perceptual processes. 
If perception as such is neither memory nor inference, then what might we ask 


is the proper interpretation of the claim that perception is indirect? The indirect 
realist, as suggested earlier, has one option left open to him; he may assume that 
although neither memory nor inference can epistemically mediate or adumbrate 
perception, some psychological correlate of the retinal image or a brain state, say 
the sensory icon, may succeeed in doing so. 

Let us try, one last time, to give a precise meaning to the hypothesis that 
perception is indirect rather than direct. The scheme that the theoretical con- 
trast must satisfy can be minimally given as follows: Perception of X (for 
example, the world, an apple, an enemy ship, a person’s face) is not direct as it 
seems but is really indirect—that is, when looking at, listening to, or feeling for X 
we do not experience the properties of X but the properties of Y instead—where 
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Y is some physiological correlate (for example, a brain state) or physical 
correlate (for example, a retinal image) of X. 

By contrast, the direct theory of perception can now be made quite explicit; it 
simply denies the validity of the above proposition by asserting that to perceive 
X is to experience the properties of X rather than the properties of something 
else, 

The crux of the issue comes down to whether or not either party can give a 
cogent argument as to which set of properties is experienced. For the indirect 
theorist, it must be shown that our experience of the properties of X is somehow 
indirect vis 4 vis the properties of Y which are distinct from those of X; while for 
the direct theorist, it must be shown that our experience of the properties of X 
are indeed an experience of the properties of X with no other properties not 
belonging to X being experienced in their stead—that is, with no other ones 
acting as epistemic mediators. 


Perception: A Process Transparent to the Properties of the World 


The first thing to notice is that there is no serious proposition for the direct 
realist to challenge if all that the indirect realist means by the above hypothesis 
is that the properties of the mediating construct Y are the properties of X. In 
other words, the two views are only trivially different if the theory of indirect 
perception asserts that we see the world somehow through a retinal image or 
brain state in the sense that perceptual information about the world “flows” 
through them. Such a view, in essence, agrees with the direct realist’s hypothesis 
that the central nervous system, no less than air, the lens system of a telescope, 
or the lens or humors of the eye, is essentially transparent to the properties of 
the world as conveyed by perceptual information. 

The stick used to probe the shape and depth of a hole, or to feel the roughness 
or solidity of a terrain, is perceptually transparent to the shape, depth, and 
texture information picked up at the other end. Similarly, if you are offered a 
pig in a poke (a sack) to feel and you succeed in identifying its contents, then 
the poke provides a functionally transparent medium to the pig. In such a case 
we might say that the pig-in-the-poke situation affords pickup of pig-invariant, 
haptic information. 

Obviously, if the two views of realism are not to be just trivially different, then 
the properties of the perceptual referent must differ significantly from the 
properties exhibited by the hypothesized mediating construct. Otherwise, as 
William James suggests, a difference that makes no difference is no difference at 
all. On the other hand, s serious puzzle is posed for indirect realism if the 
properties exhibited by the construct should depart too radically from those 
exhibited by the object presumed known. Consequently, a necessary condition 
for knowledge of the world—that is, for any kind of realism—is that the physical 
and physiological processes which support the act of perception must somehow 
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be functionally transparent to the properties of the world. Let us explore this 
necessary requirement in more detail. 

Insofar as the properties of the construct itself are perceived, then the 
construct offers no information about its world-referent but only information 
about its own properties as an epiphenomenon of the causal process supporting 
perception. Hence, under such circumstances the construct, although a causal 
mediator, is not an epistemic mediator—that is to say, it plays no contributory 
role in perception but only a supportive one. 

Thus, we must recognize that an egregious error is committed whenever one 
asks whether or not the epistemic act of perceiving the world is causally 
supported—causally supported, yes, but no sense can be attached to the former 
question. To avoid this problem of slipping unaware between levels of dis- 
course—that is, between talk about causation and talk about knowledge—let us 
agree to distinguish two popular but equivocal uses of the term “perception”: 
perception as a causally supported, time-dependent process must be distin- 
guished from perception as an immediate, epistemic act. 


Perception-An epistemic act or causal process? The time has arrived to make 
good on the promissory note offered earlier and supply a rebuttal of the criticism 
that the direct perception of information is but another form of epistemic 
mediation since it stands between the perceiver and his world. The only proper 
response a direct realist can give to this charge is to assert unequivocally that the 
pickup and differentiation of available information brings the knowing-agent 
into direct contact with some of the significant properties of the world, namely, 
with those properties that are of considerable ecological significance. Such 
contact, however, is epistemic rather than mechanical or causal in nature. 

For instance, a mechanical or causal contrivance, say a telephone or phono- 
graph, can put us in direct epistemic contact with another person’s voice over 
distances of space or time, respectively, that are too great to allow direct 
mechanical contact betwen our bodies. If the person is a friend and the 
transmission signal is of sufficiently high fidelity, we will recognize the voice as 
familiar and, perhaps, even correctly identify its unique source. Thus, the 
transmission of the acoustic signal, as modulated originally by our friend’s vocal 
tract, provides the causal support needed to convey sufficient information for 
correctly identifying the voice heard. It is the identification which is the 
epistemic act, not the process of transmission as such. 

Indeed, it would be a category mistake to confuse the properties of the 
transmission process not directly assignable to our friend’s voice quality with 
those that are. Clearly, the ownership of the familiar vocal qualities left invariant 
by the transmission process is our friend’s rather than the electrical process as 
such. 

Similarly, we directly experience the structure of a Beethoven piano concerto 
when we hear it over the radio. The perception of the concerto is the experience 
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of the information specifying its acoustic structure as left invariant by the 
vibratory action of air molecules set in motion by the piano and likewise as 
preserved by the radio transmission process which eventuates in the speaker 
pulsations by which we hear the concerto over our radios. We should no more 
count the properties of these electromechanical media as an intrinsic part of the 
perception of the music than we should count the vibratory activities of our 
cochlear or neural transmissions. For, in hearing the music, we come into 
epistemic contact with the structure of the music rather than with the structure, 
or activities, of those processes furnishing the causal support of the music. 

In general, then, perception is the act of apprehending the properties of the 
message conveyed rather than the properties of the medium by which they are 
conveyed. In fact, we a fortiori do not experience the message when we 
experience the supporting media. Thus, we must guard against reifying the 
information experienced in perception. Information is parasitic on its source. 
Unlike the medium which causally supports the pickup of the information, by 
definition, information has no structure that does not belong to the world 
referent which is its source. Any properties of the causal process which supports 
the information not assignable directly to the environmental source are by 
definition not information but noise—that is, properties of activities or struc- 
tures that are uncorrelated with the source. 

The so-called “directness” of perception as an epistemic act rests on the 
fundamental thesis that the tissue media of the human body, no less than the 
physical media of the environment, provide only causal support for information 
about the world and, as such, are essentially transparent to most of those 
properties of the world required by organisms in order to maintain a sufficient 
level of adaptive functioning. It seems quite reasonable to assume that evolution, 
after so long a time, should have tailored the perceptual systems to be sensitive 
to the ecologically significant properties of the world rather than to the abbera- 
tions produced by dysfunctional properties of the eyeball, inner ear, neural 
tracts, or brain states. Can even a prima facie case be given for the adaptive 
significance of organisms having sensations of their organs of reception? 

No. It seems more likely that nature would have designed the perceptual 
systems of animals and humans so as to render the various media functionally 
transparent to the ecologically significant properties of the world upon which 
the survival, health, and happiness of all animals sorely depend. We might even 
expect that the sensory physiology of humans and animals, if it does any 
transforming of incoming information at all, should act to enhance rather than 
destroy the fidelity of such information. Eyeglasses, hearing-aids, and other 
prosthetic sensory devices are based on rather simple principles by which specific 
dimensions of useful information can be focused and amplified. 

Should we expect nature to be so slack in selection of evolutionary laws as to 
overlook similar ways of enhancing information that plays such an obvious and 
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important role in the survival of the species? Ample evidence suggests that the 
perceptual systems are highly specialized; but to what end if not to accomplish 
the successful transmission of important information about the world? 


Perception as process versus perception as act. To clarify the above proposi- 
tion requires that a careful distinction be drawn between perception, as a causal 
process, wherein causal mediation is logically possible, and perception, as an 
epistemic act, wherein no epistemic mediation is logically possible. Consider, 
first, what is meant by epistemic mediation. 

The thesis of epistemic mediation, as discussed earlier, suggests that one must 
know X before one can experience Y. Here X would be the epistemic mediator 
of the experience Y. For instance, we must learn the rules of chess before we can 
play chess. Here the learning of the rules mediates the experience of playing 
chess. 

Similarly, we must learn how to hold a nail against a suitable material (for 
example, wood), to grasp the proper end of a hammer, and how to swing it, 
before we can experience the act of hammering. Thus, learning to coordinate the 
proper materials, tools, and activities is epistemically necessary if we are to 
experience the desired end result. The means-end analysis of problems and tasks, 
in general, constitutes true epistemic mediation of the experienced goal state. 

But notice, it is not the rule book, chess pieces, or chess board which 
epistemically mediate the experience of playing chess, nor even the moving of 
the pieces around the board in proper response to an opponent’s moving of a 
piece. All these aspects of the game provide what Aristotle called the material 
cause (for example, board and pieces, two players), formal cause (for example, 
the rules and strategies), and efficient cause (for example, the moving of the 
pieces) of the game. These necessary components of the game of chess, along 
with the final cause, perhaps (for example, the desire to win), constitute the 
causal support of the game. In no sense, however, do they constitute epistemic 
mediators of the experience of playing the game. 

Likewise, in no sense does the physics of the energy propagated (for example, 
light, sound) from object to receptor system, nor the physiology of the neural 
transmission from receptor to cortex, constitute processes which epistemically 
mediate perception, —the act of perceiving. At best such processes only consti- 
tute causal support for perception while constituting everything that we mean 
by perception, —the process of perception. Let us unpack this distinction fur- 


ther. 


Constitutive versus preparatory experiences. To continue the analogy: 
Neither do the experiences of seeing the board, pieces, and moves made by the 
opponent, nor the experiences of planning and executing one’s own moves in 
accordance with the rules, qualify as epistemic mediators of the experience of 
playing the game of chess. For these experiences are constitutive of the game 
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experience, that is part of its content, rather than preparatory to it. Epistemic 
mediation is always preparatory to accomplishing an act but never constitutive 
of the experience of the act. 

To confuse the experiences which are constitutive of an act with the experi- 
ences only preparatory to it, is to be guilty of a category mistake—the mistaking 
of one type of thing for a thing of a different type. More precisely, it would be 
to confuse that which is directly contributory to our knowledge of the act with 
that which is only ancillary* to it. 

Consider still another example: To eat a meal we must first prepare the meal; 
but the experience of eating the meal is not the same as the experience of 
preparing the meal. The former experience is directly contributory to (or 
constitutive of) our knowledge of the meal while the latter is only ancillary (or 
preparatory) to that knowledge (although the latter experience is constitutive of 
our knowledge of preparing the meal!) 

Thus we must ask the indirect realist if the experience of the retinal image, or 
brain state, as causally determined by an object, is constitutive of our knowledge 
of the object or only preparatory to that knowledge. If, as we have supposed 
throughout, the indirect realist means the experience of such constructs to be an 
epistemic mediator of our knowledge of the object, that is preparatory to such 
knowledge, then there is no quarrel with the direct realism, for such a view 
would not preclude the possibility that we also directly experience the object. It 
would not preclude it, just as preparation of a meal does not preclude our 
consuming it. We would, however, be quite perplexed if they thought it should. 

On the other hand, if the indirect realist meant instead that our experience of 
the retinal image (or other construct), as causally determined by the object, were 
constitutive of our knowledge of it, then we would be perplexed as to how 
experience of one object (for example, the retinal image) could be tantamount 
to experiencing another unless, of course, the objects were in some fundamental 
sense the same object. Then we would feel justified in concluding that whatever 
their presumed difference it would be in no significant way a difference which 
made a difference. Thus it should be quite accurate to say that the so-called 
“mediating” construct is, indeed, functionally transparent to the properties of 
its world-referent. Again there should be little to quarrel about since there would 
not be a nickel’s difference between the two views. 

There is, however, a third possibility that the indirect realist may have in mind; 
one which we feel is actually the source of the current confusion regarding 
whether or not perception is direct or indirect. This is the claim that the 


*It is interesting to note that the Latin root of the word “ancillary” is ancillari(us), as 
relating to maidservants (ancilla). The experiences that epistemically mediate are like the 
work done by servants in preparing the food to be eaten, while the experiences of chewing, 
swallowing, etc., are constitutive of the act of eating. Thus, it is a mistake to consider the 
component experiences that are integral to a complex perceptual act as epistemic mediators 
of that perception. 
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experience of the retinal image, or brain state, is both preparatory to and 
constitutive of our knowledge of the world-referent which causally determines 
it. Let us explore this claim carefully, since it is here that a category error is 
most likely to occur. 

How might the indirect realist rationalize the claim that the experience of a 
construct is both preparatory and constitutive of our perceptual knowledge of 
the world? He might argue, as suggested earlier, that experience of the given 
construct “mediates” our knowledge of its world referent by preparing us either 
to recall previous information about it, or to make inferences as to its probable 
nature (Brunswik, 1956; Gregory, 1970; Bransford et al., this volume). Unfor- 
tunately this suggestion does not help, as the following illustration shows. 

Suppose that a round, red, fruity smelling, object is placed before us. We 
perceive these properties and recall that such properties typically specify a 
sweet, juicy, edible object called an apple. Consequently, we infer from the 
available information that the present object is also an apple and, therefore, 
quite edible. But upon biting into the object, we discover, much to our surprise, 
that it is not a real apple at all but an inedible, waxy substance that had been 
impregnated with a fruity perfume. Hence our mistake was not perceptual but 
judgmental. It is our cognitive induction which is at fault, not our perception of 
the object’s true properties. 

The moral of the above example is that memorial or inferential information is 
not necessarily adequate for distinguishing experiences, say, the edibility of real 
apples from the inedibility of wax apples—the only information that necessarily 
suffices must be perceptual in nature. A fundamental principle of epistemology 
is indicated here, namely, the principle that the information needed to dif- 
ferentiate among experiences of a certain type (for example, perceptions) must 
be constitutive of those experiences rather than merely ancillary or preparatory 
to them (for example, memorial or inferential). 

To experience the edibility of an apple as an adult there is good reason to 
believe that we must have certain general preparatory experiences as a child: We 
must learn how to grasp objects with the size, weight, shape, and surface texture 
as apples, how to bring them to the mouth, and how to bite, chew, and swallow 
them. We might call this the memorial or historical preparation for the per- 
ceptual experience in question. 

Similarly, when perceiving the subset of properties (for example, shape, color, 
size, texture, and odor) shared by the wax object and real apples, we are 
prepared on the basis of our memory to infer inductively the edibility of the 
object in front of us. However, it is only upon perceiving the information 
actually made available by our biting into and chewing the object that we realize 
our error, This is because only information constitutive of an experience (for 
example, the perceived edibility of an apple) is necessarily adequate for distin- 
guishing it from other experiences (for example, the perceived inedibility of wax 
applies). Preparatory knowledge, as provided by memory or inference, can bring 
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us to the threshold of perceptual experience, may even bid us enter, but can in 
no way participate in the knowledge drived from such experiences. 

Thus, we must conclude that, insofar as we are able to distinguish 
perceptual experiences, then we must do so on the basis of the knowledge 
accruing directly from them, and never from knowledge accruing indirectly from 
other types of experiences (for example, inference or memory). Such cases 
where inference or memory do save us from error are just such cases where they 
adequately prepare us to expect what we subsequently experience. Error in 
judgment arises when they do not adequately prepare us for what follows. 

But notice, it is necessarily the ensuing perceptual experience which is the final 
arbiter of our knowledge, since ultimately what properties are true constituents 
of the world-referent must also be true constituents of our perceptions—else the 
world is but an inferential mirage, and any form of knowledge about it, without 
any real foundation. 


Elementarism versus holism in perception. But, perhaps, there is still a way 
the indirect realist might accept the above argument and yet remain unconvinced 
that perception is an epistemically pure act—that perceptual experience is 
uncontaminated by memorial or inferential constituents. The indirect realist 
might argue as follows: It is reasonable to assume that some perceptions are 
more complex than others. Consider, for instance, the difference in perceiving a 
musical note, a melodic phrase, a movement, or the whole concerto. Or, to give 
another example, consider what is involved in perceiving a chess piece, a move of 
the piece, an opening strategy, or the whole chess game. Must we not assume, 
the indirect realist might press, that the lesser constituents of a whole situation 
are necessarily perceived prior to the perception of the whole? If so, then the 
perception of the whole requires time—a time during which the previous experi- 
ences of the lesser ingredients must be held in memory until they can all be 
accumulated and integrated into the experience of the whole. Thus the resulting 
perception, as an integration of memories, can properly be said to be epistemi- 
cally mediated by memory while at the same time being comprised of memories. 
In this sense is not memory, contrary to the earlier argument, indeed both 
preparatory to perception of the whole (for example, the concerto) and con- 
stitutive of it (for example, memory of the notes, melodic phrases, movements)? 

The answer is no, because to draw such a conclusion is again to be guilty of a 
category error—the mistaking of experiences preparatory to a perceptual act 
with the knowledge gained from experiences constitutive of the act. Or as Ryle 
(1949) put it, such a category mistake is like that committed by a young child 
who, upon seeing the marching soldiers, hearing the band, and witnessing the 
passing of the horse drawn canons, remarks with great puzzlement to his father 
that he thought he was going to see a parade—as if the parade were something 
substantively different that what was already experienced. 

The act of perceiving the whole parade consists precisely in perceiving, not 
something different from the components of the parade, as viewed in isolation, 
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but something more-the interrelationship among the components which per- 
ceptually welds them into a single event. That the successive ordering of each 
component must be remembered if the complete character of the parade is to be 
appreciated, is not at all in question—but to assume that such memories play a 
constitutive role in perception is to commit a category error. 

The perceptual act is nothing more or less than apprehension of the configura- 
tional properties of the event—the network of adjacent and successive relation- 
ships in which the individual components of the event mutually participate. 
Thus, perception is the act of experiencing that mutual participation rather than 
the memory of their successiveness. 

Indeed, insofar as their successiveness and independence are remembered as 
isolated events, to that extent they are not perceived as participating in the same 
superordinate event. Hence memory and perception are complementary and act 
in contrary directions with respect to experiencing wholeness. For the child who 
was only capable of experiencing the horse drawn canons, the soldiers, or band 
in isolation, there is no parade. We might say of such a child, if the condition 
proved chronic, that he was a victim of parade “agnosia”—the inability to 
recognize certain configurational properties of series. But notice, we would not, 
however, say that such a child were amnesic, since there would be no evidence of 
loss of memory for specific contents. 

The failure to make this distinction is such a popular source of error that it 
may be well to illustrate in several ways the complementary nature of memory 
and perception: Tasting the individual ingredients before they are combined into 
the cake batter is not to taste the finished cake. Whether or not there is accurate 
memory of each ingredient that goes into a recipe is irrelevant to the experience 
of the final product—although memory for what was done may, of course, help 
guide the preparation. However, the only knowledge constitutive of the experi- 
ence of the whole accrues at the moment of experiencing the whole. Nothing 
that precedes that experience of the final combination of ingredients can even in 
principle be constitutive of the knowledge accruing from it. 

Thus, memory of the ingredients as successively experienced is not necessarily 
sufficient to explain, or to determine, the perception of their combined presen- 
tation. Moreover, the combining is accomplished in the act of perceiving, not in 
the act of remembering. Memory can only bring successive parts closer together 
in time; it can not integrate them into a presentation to be experienced whole.’ 
What we have called perception, is the name of the process by which the 
requisite act of integration is achieved. By contrast, the experiencing of this act 
is what we have called perception, . 

Perception, can be likened to an election process—a sequence of preparatory 
steps that consumate in the installation in office of a duly elected official. 
Perceptual experience (perception2), as an epistemic act, is analogous to the 


5 My thanks to Michael Turvey for this observation. 
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consummatory aspect of such a process. Consider: The preparatory steps to an 
election involve many distinct but complex ingredients—nomination of candi- 
dates, campaigning, registering and polling of the electorate, counting the votes, 
public posting of the results, as well as a final inauguration ceremony and 
installation into office. None of these components of the election process counts 
as an election in and of itself, although they are all preparatory to its successful 
realization. What constitutes the experience of an election is both more subtle 
and more general than the experience of any of these. What constitutes an 
experience of an election is the act of realization that all the preparatory steps 
have been successfully (legally) consummated. 

Perception is itself such a consummatory act. By their very nature, such acts are 
directly experienced and thereby sufficient unto themselves as a valid source of 
knowledge. We should not say for certain that the election of an official had 
been successfully achieved unless we believed each step in the election process 
had been legally executed. Similarly, we should not experience the perception of 
a complex event unless the preparatory stages of processing had been properly 
satisfied. In the former case, a failure to satisfy the necessary antecedent 
conditions would render the election a fraud and, in the latter case, the 
perception a hallucination. 


Conclusions 


If the above arguments are valid, then there is no cogency to the argument that 
perception is indirect. Indeed, sufficient problems are encountered in attempting 
to clarify the meaning of such a claim that it seems wise to abandon the term 
altogether. However, in the wake of such problems of interpretation, it may be 
equally wise to avoid the use of the contrastive term “direct” when addressing 
perceptual issues. 

The recalcitrant issue, in any case, is not whether or not perception is “direct” 
or “indirect” but how to keep clear the distinction between perception as a 
causal process (perception,) and perception as an epistemic act (perception). 
The former concept retains all that is viable in the notion that perception 
somehow depends upon necessary physical and physiological factors (causal 
support), while the latter concept retains all that is viable in the notion that 
perceptual experiences may in some cases be mediated—those cases being where 
previous knowledge prepares one for further knowledge. 

Regarding the possible role of memory or inference in perception: any form of 
epistemic mediation to which they may contribute serves perception but does 
not contribute directly to its nature. Although there is no denying the necessary 
role of causal process as support for the act of perceiving the world, there is 
considerable reason for denying that any stage of the causal process—neither 
retinal image, neural transmission nor cerebral state—can serve as an epistemic 
mediator of perceptual experience. 
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In the final analysis, the most important thing we can say about the nature of 
perception is this: We perceive the world, perhaps, as through a glass darkly—but 
we do perceive the world. As scientists, we may attempt to infer how we might 
do so, but as perceivers no inference is involved in the doing. 
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INTRODUCTION 


For many years James Gibson has been impressed with the precision of visually 
guided behavior. In 1938 he published a study of automobile driving (Gibson & 
Crooks, 1938); during World War II he studied visual components of aviation, 
particularly the activity of landing airplanes. The priority that these studies 
reflect, and which seems to have been developing with increasing explicitness as 
Gibson’s ideas have developed, is to treat perception as a biologically adaptive 
activity first and as a study of “interesting phenomena” much later—if at all. On 
Gibson’s view, even though it is surely true that all perceptual phenomena, and 
particularly curiosities such as illusions, can tell us something important about 
how visual systems work, curiosities are probably the last things we want to 
worry about in our theories. First, theories should do justice to the everyday 
perceptual accomplishments that contribute to the survival of the species. The 
problem of guiding ourselves (cf. Gibson, 1958; Turvey, this volume), as well as 


cars and airplanes is primarily a problem of veridical perception. These are cases 
e touch with the environment. Yet the 


where perception seems to be in clos 
traditional theories of space perception available to Gibson when he was faced 


with the practical problem of understanding airplane landings and visually 
guided locomotion had little to say about such cases. Therefore he gradually 


struck out on his own. 
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THE FIRST VERSION OF GIBSON’S PROGRAM: 
PERCEPTUAL PSYCHOPHYSICS 


New Conceptions of the Stimulus 


By 1950 Gibson thought that the need was for what Troland called a psycho- 
physics of perception (cf. Lombardo, 1973, for an extensive discussion of the 
history and development of Gibson’s ideas, including his relation to Troland). It 
was believed at the time that the “givens” in the light to the eye could not 
support perceptual phenomena, but only elementary experiences such as sensa- 
tions. Organized percepts were thought to be constructed by adding order and 
meaning to the sensations which elementary stimuli were capable of evoking 
directly. Gibson reopened the case for thinking of perception as being a function 
of stimulation by offering new conceptions of the stimulus. Even though it 
seemed clear that perception could not be a direct function of the variables of 
stimulation such as frequency and intensity of light (those variables ordinarily 
controlled in psychophysical experiments), there was no reason to believe that 
perceptual experiences could not be supported by any of a wide variety of other 
patterns of variation in light. The best known candidates for being alternative, 
“higher order,” variables of stimulation relevant to perception were the gradients 
of texture and texture flow that Gibson discussed in The Perception of the 
Visual World (1950). These were shown to specify surface extent and surface 
slant (Purdy, 1958) as well as the path traveled by a moving observer (Gibson, 
Olum, & Rosenblatt, 1955). 

The new insight Gibson provided in these examples was that the stimulus for 
perception is just as much a problem for research and theory as are the 
mechanisms of perception. In fact once the appropriate description of stimula- 
tion becomes a scientific problem, one might well presume that it is logically and 
strategically prior to any detailed proposals about processing (Mace, 1975). That 
is certainly the major thrust of current linguistics and psycholinguistics 
(Chomsky, 1972). By specifying the accomplishments of a system first, one can 
gradually limit the class of plausible mechanisms for perception to those that 
satisfy the job description. For example, Gibson has argued that just as units of 
structure can be sought at many levels of analysis in the light, so perceptual 
mechanisms functioning as units might be analyzed at corresponding levels of 
analysis. Thus, as he has often argued, even though light may be a stimulus for a 
rod or a cone, patterns like gradients are stimuli for organs such as eye-brain 
systems. He has never worked out a detailed theory of the detection of a 
particular variable of stimulation, but he has not been mute on the matter of 
processing either. His suggestion that the functional’ organization of perceptual 
systems must be analyzed with respect to corresponding levels of stimulus 
patterning is a serious constraint that more detailed theories must consider 
(Gibson, 1966). Surely imposing constraints to be satisfied by possible mecha- 
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nisms is a key part of any effort to understand what mechanisms are actually at 
work. 


Environmental Constraints on Stimuli Surfaces 


The perceptual psychophysics that Gibson proposed was actually more than 
merely an attempt to specify constraints on processing models. In addition to 
arguing for perception as a function of stimulation he argued that much 
stimulation is a function of the environment—and, therefore, that it is possible 
to view perception as a function of the environment. Theories built within a 
program like this would not slight the facts of biological adaptation and useful 
perceptual activity. By considering the environment and its structure as part of 
perceptual theory the scientific task of understanding perception does not 
necessarily become easier, but the task looks more like the investigation of 
natural law and less like the contemplation of miracles. 

Surprisingly it is not uncommon for perceptual psychologists to ignore or 
downplay the important role that environment-dependent stimulus mapping 
relations play in Gibson’s theory, applauding him only for recognizing the role 
of “higher order” variables in stimulation (Garner, 1970; Hochberg, 1974). 
Apparently researchers who are willing to investigate stimulus structure do not 
recognize a need for principled limitations on the stimulus variables sought. The 
psychophysical task is simplified inasmuch as one would limit the patterns used 
in experiments to those which might correspond to selected environmental 
properties. Otherwise Gibson’s insight that “higher-order” variables of stimula- 
tion could function as stimuli for perception would have opened a Pandora’s box 
of possibilities. Researchers could wander aimlessly through long lists of patterns 
searching for those to which observers responded in experiments constrained 
only by their imaginations and the existing body of literature, possessing a 
selection strategy based solely on luck and tenacity. 

The environmental features that have occupied Gibson most are surfaces, 
especially the ground. He has maintained that what organisms see when per- 
ceiving the arrangement of the world is never “space” per se as presented in 
traditional theories, but surfaces and their interrelations. To say that one sees a 
vast expanse of space, or depth, receding far into the distance is wrong to 
Gibson. The experience should really be described in terms of the extended 
surface seen, since a person can only see an expanse of distance when presented 
an appropriate texture gradient indicating a ground surface extending to a 
horizon. By contrast, Gibson has pointed out that a person looking at patterns 
which ought to be specific to pure space, such as a Ganzfeld, give rise to little if 


any experience of space. He said, 


Spaces are determined by their surfaces.... A space is a surface; at least an environ- 


mental space always has a floor or a ground.... In general, a space is an unbounded 


surface. ... The biggest space we are capable of seeing is the surface of the terrain. ... 
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The sky, paradoxically, presents scarcely any stimulation for space perception although 
it is what psychologists have been tempted to call space [Gibson, 1959, p. 478]. 


The transitive relation of the physically extended ground to optical texture 
gradients to the perceived extended ground was the paradigmatic instance of 
stimulation as a function of the environment and perception as a function of 
stimulation, hence perception as a function of the environment. 

Through the 1950s, Gibson held to the idea that his “ground” theory of 
perception could be considered a basically psychophysical program. Since that 
time, however, Gibson has discovered his system developing beyond anything 
that could properly be called psychophysics. His is now a radical position in the 
literal sense of proposing to reformulate drastically the foundations of per- 
ceptual theory. 


THE REVISED VERSION OF GIBSON’S PROGRAM: 
AN ECOLOGICAL APPROACH TO PERCEPTION 


Gibson’s psychophysical approach to perception marked a significant departure 
from mainstream thinking on how to frame and answer questions about percep- 
tion. Few others had entertained the possibility of studying stimulus structure 
with the aim of finding correspondences between stimulus and percept, and of 
concentrating on biologically significant percepts. However, there was much in 
Gibson’s early vision which was far more traditional than the ecological ap- 
proach he is now developing. Succinctly put, the psychophysical program was 
basically a stimulus-response psychology. It was new in that it took the stimulus 
as well as stimulus—response relations as objects of investigation, but traditional 
in considering perception to be a set of responses to presented stimuli (albeit 
“higher order” stimuli). 

What is important to realize now is that Gibson does not presently hold a 
stimulus-response view in any commonly understood sense, even though he 
continues to maintain that perception of the environment is direct as opposed to 
being mediated by nonpercpetual stages of psychological processing (e.g., 
memory, inference, or imagination). To do this is what requires his current 
approach to be so radical—radical in the sense that he claims that a direct theory 
of perception is both plausible and necessary. 

His current approach is also radical in the sense that, first, it requires de- 
veloping a theory of what there is to be perceived as an integral part of a theory 
of how perceptual processing could possibly occur and, secondly, it requires a 
theory of how processing actually does occur. Gibson makes it clear in his 
current theory that one can only have direct perception if the environmental and 
organismic components of perceptual theory are compatible. Presumably they 
will be compatible only if one develops each component of the theory with an 
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eye to the other. For Gibson, one cannot realistically expect to synthesize a 
general theory of perception from patching together a theory of the physical 
world constructed by physicists who are primarily interested in the imper- 
ceptible microstructure of matter with a theory of optics developed for lens 
makers, astronomers, and microscopists with a theory of image recording de- 
veloped for painters and geometers with a theory of neural functioning devel- 
oped for communication engineers so as to yield a unified theory of adaptive 
perception for ecologically minded psychologists. 

Thus Gibson himself has done a great deal of work on the question of what 
there is to be perceived in the environment and in energy structures (for 
example, an optical array where vision is under discussion) as well as sketching 
out new ideas of how to catalogue perceptual systems in ways that mesh with his 
analyses of what there is to be perceived. Each of these components, the what 
and the how, must be considered an integral part of the same perceptual theory 
for Gibson. (For a detailed discussion of this point see Shaw & McIntyre, 1974.) If 
they do not fit together, then the structure falls and direct perception becomes 
untenable. 

Over the last 10-15 years Gibson has tried to develop enough theory in each 
of these realms to demonstrate that direct perception is indeed plausible even if 
hordes of difficult details remain to be worked out. The research and theory that 
form the content of Gibson’s program such as his analysis of the optic array, 
stimulus information, and the functional organization of perceptual systems are 
what Gibson most often points to as radical features of his work. These will be 
treated soon. 

The remainder of this chapter will first describe the key concepts currently 
holding the system together and then will examine the multiple underpinnings 
which make direct perception plausible by discussing five different ways one 
could hold an indirect theory of perception. 


MOTION PERSPECTIVE AS A CASE AGAINST 
PERCEPTUAL PSYCHOPHYSICS 


The concepts that Gibson developed which decisively distinguish his ecological 
approach from his psychophysical approach may be illustrated by referring to 
his paper on motion perspective (Gibson et al., 1955). This paper presents a 
formalization of principles Gibson had discovered from his work on aircraft 
landing, the full implications of which, however, were not pursued until after 
this paper. 

Essentially the Gibson et al. (1955) analysis of what they called motion 
perspective was a generalization of earlier analyses of motion parallax by 
Helmholtz. In motion parallax the rate of optical flow of points in a stable 
environment relative to a moving point of observation is inversely proportional 
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to the distance of the environmental points from the observer. The farther away 
the points, the slower they translate in the visual field. However, this is an 
analysis of that portion of the visual field parallel to the path of locomotion. 
Gibson noted that there is, in fact, texture flow all around the moving point of 
observation. Gibson ef al. (1955) formalized the case of rectilinear motion over 
an extended surface showing that motion parallax was a special case of their 
motion perspective. 

There are three items of special interest in their analysis. The first is the 
observation that the equations for texture “motion” not only specify relative 
distances of stable environment points, but that the path of locomotion of the 
moving point is reciprocally specified. That is, the same global transformation of 
texture can be decomposed into parts that are specific to the environment and 
to the path of whatever is moving in that environment: 


The fundamental visual perception is that of approach to a surface. This percept always 
has a subjective component as well as an objective component, i.e., it specifies O's 
position, movement, and direction as much as it specifies the location, slant, and shape 
of the surface [Gibson et al., 1955, p. 383]. 


A second important feature is that significant parameter values such as the angle 
of inclination of the approach to the surface and the point of imminent contact 
remain the same as long as the motion is uniform and the environment stable. 
They are invariant properties of the optical flow. The third fact to note is that 
the moving point of observation generates a specific texture flow with its 
characteristic invariants defined relationally among many samples of points. This 
Suggested to Gibson that the path of locomotion was probably specified every- 
where in the flowing texture—which would in turn imply that any eye sensitive 
to the crucial variables of stimulation could register these variables at different 
places in the array. On this analysis one could theoretically take various samples 
of the flowing texture and get the same surface and path information. 

What all this has in common with the psychophysical program (within which 
this particular analysis was carried out) is a concern for correspondence between 
optical texture and environmental conditions with special attention to the 
optical texture generated by surfaces. What makes it hard to fit into a psycho- 
physics is that none of the traditional interpretations of the concept of a 
stimulus seem to be involved (see Gibson, 1960, for a survey and discussion of 
these meanings). Ordinarily the notion of stimulus has indicated something that 
could be applied or presented to an animal followed by an observable response. 
But each of the three points mentioned above makes it difficult to view the 
optical flow as a stimulus that could be presented in the ordinary psychophysical 
experiment. 

First, if one were to test the sensitivity of an organism to the world events 
potentially available, there are at least two major judgments to be made instead 
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of one—the path of locomotion and the layout of the environment. To say that 
one stimulus could cause two simultaneous, complementary responses (that is, 
what Gestaltists have called a “scission” effect) is unlike psychophysics as 
traditionally practiced. More importantly, such dual specification contained in 
the changing stimulation to a moving point of observation (that is, a locomoting 
observer) indicates that a very special kind of analysis is needed to explain how 
such perceptual processing might occur. Presumably, such an analysis will be in 
terms of acts of perceptual differentiation into orthogonal components of in- 
formation—one component specifying invariant environmental properties, an- 
other specifying the observer’s place in that environment as a creature with a 
history. 

Traditional psychophysics, on the other hand, has typically been rationalized 
as the study of automatic responses to stimuli (rather than acts) which serve as 
building blocks for perception. From this viewpoint, however, the observer is 
but a passive receiver. Under such a view, Gibson argues, a perceiver would not 
be able to differentiate stimulation into its most useful dimensions. Thus, 
traditional psychophysics encounters serious difficulty in attempting to explain 
why the optical motions of an image over the retina is not intrinsically ambigu- 
ous with respect to whether the source of the optical motion is due to observer 
movement or environmental motion. Under this view it is not at all clear that 
one can explain how passive registration of a stimulation flux will allow the 
observer to extract the perceptual invariants specifying the stable layout of the 
world. 

Second, Gibson ef al. (1955) point out that crucial features of both the 
environment and the path of locomotion are specified by relations which remain 
invariant in the optical flow of texture. But how might one present an invariant 
in a psychophysical experiment? One can certainly present displays in which 
invariant relations are defined, but to regard an invariant relation as a stimulus, 
in the sense of being a “goad” which elicits behavior (Gibson, 1960), is 
incompatible with the view that the perceptual system actively separates in- 
variant information from variant information. 

The above two points are really the same, in that perceptual processing must 
decompose the information detected into subjective-objective components as 
well as variant-invariant components. If one accepts the view that important 
optical structure is to be found in the decomposition of a total structure such as 
a flow gradient, then the idea that there can be isolated stimuli which give rise to 
isolated responses that somehow become percepts (and this must be assumed by 
psychophysics if it is to be regarded as relevant to perception at all) has to be 
rejected. 

Finally, the idea that texture flow can contain structure without reference to a 
retinal projection creates a psychophysical puzzle. Ordinarily one would think 
that an observer who is sampling different segments of an optic array over a long 
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while would be getting different stimuli. Although this is certainly true in some 
sense, such a view ignores the possibility that the information specifying the 
path of locomotion and much of the environmental layout (ground plane, 
horizon, etc.) remain invariant as long as the organism is sampling within the 
same transforming array. 


Major features of the ecological approach information and the optic 
array. Since the geometry provided by Gibson et al. (1955) specifies a great 
deal about the environmental events generating it, one would think that such 
“variables of stimulation” would play an important role in perceptual theory. 
Yet, as the above discussion shows, it is difficult to see what role they might 
play in traditional views. Over the years, Gibson solved this dilemma by rejecting 
the notion of stimulus in favor of stimulus information. This latter notion 
captures all the important aspects of the foregoing example. Because many 
connotations of the term “stimulus” are misleading from Gibson’s point of view 
(Gibson, 1960), he now prefers to avoid all use of the term. Thus “stimulus 
information” is replaced with phrases such as “information contained in...” 
some specified type of array. This idea of information is a particularly central 
concept, and thus will be used to organize the full discussion of Gibson’s 
ecological approach. 

Whenever environmental events structure light (or any other vehicle of struc- 
ture if not discussing vision) in a specific fashion, Gibson asserts that the light 
contains information for those events. This use of the term information to 
indicate structure specific to its sources is a special one. It is not the same as the 
Shannon and Weaver concept of information as a measure of uncertainty more 
commonly found in perceptual psychology. By using this term instead of 
“stimulus,” Gibson hoped to avoid the muddles he pointed out in his review of 
the stimulus concept (Gibson, 1960) and retain the possibilities of direct 
perception expressed in his psychophysical program. Information was also meant 
to capture the insights illustrated in Gibson, Olum, and Rosenblatt (1955) along 
with their extensions. Hence, for the reasons already mentioned, perceiving 
based on the detection of variables of stimulation (information) such as were 
shown in that analysis could not be thought of as simple responding to physical 
stimuli. 

Information (for vision) is a geometric concept defined over a transforming 
optic array, the 360° solid angle of variations in ambient light intensity con- 
verging on a point of observation from all directions. Animals or humans do not 
enter the picture except as a scale factor for selecting appropriate environmental 
features to analyze. Thus Gibson can speak of available information in an optic 
array. Whether or not a particular observer can detect such information is 
alogically distinct question, though of course a necessary question for the 
psychologist. 
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The optic array is taken as structure surrounding a point of observation. The 
point of observation in the geometric analysis (for which the nodal point of an 
observer’s eye may be substituted in applications) is taken to be immersed in this 
array structured by environmental events. To replace the traditional image of the 
“stimulus” for vision as a picture that can be presented to an observer by the 
ambient optic array has a number of consequences. First the structured ambient 
array is always present and cannot be turned on and off the way stimuli in 
psychological experiments can be. It makes far more sense to characterize the 
basic contact of organisms with such an array as one of exploration than as a 
response. Furthermore it makes no sense to think of an organism’s responding to 
isolated bits of such an array as if they existed in hermetically sealed packages. 
Rather one would be more likely to view the perceiver’s problem as one of 
decomposition. That is, if a point is immersed in a richly nested structure of 
optical events, which of these can a particular perceiver separate out to respond 
to? Finally, the particular optic array of the terrestrial environment which is 
light above and dark below, in correspondence to sky and earth, provides the 
foundation for an absolute frame of reference within which all other event 
structure can be nested. Gibson now argues that this fundamental invariant is 
perhaps the best place to emphasize that his approach deals with real environ- 
mental space, which does have an intrinsic polarity, and not with abstract 
geometric space, which does not. In doing so he hopes to avoid the regresses and 
hopeless relativism that are implied in discussions of frames of reference in much 
perceptual literature. 

Since information refers to variables that are specific to environmental fea- 
tures, Gibson must determine what these correspondences are. Where there are 
persistent features of the environment such as the substantiality and rigidity of 
surfaces to be specified, Gibson would want to find correspondingly persistent 
features of the ambient array (invariant information). Where there are changes in 
the environment such as motions, there should be changes in array structure 
(variant information). One should note carefully that specificity and not intui- 
tive similarity is the basis for correspondence. Gibson does not expect the 
information for motion to be motion or the information for shape to be forms. 
Rather, in cases where persistent environmental properties are specified, he says 
that the information is contained in “timeless and formless invariants.” Struc- 
tures defined in terms of adjacent and successive orders of units are the types of 
things he has in mind here. Interestingly enough, this all follows from the 
assumption that the ambient optic array has a nested structure. If the point of 
observation is surrounded by densely packed structured light, then there are no 
units to move or to form the basis of shape perception. Every change in the 
array, whether induced by movements of an observer, by motions of objects, by 
changes in material composition, or by changes in shape must be regarded as a 
transformation of the whole array. To say that an object in the array moves, in 
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the example of motion, would be to presuppose the appropriate decomposition 
of the array (in terms of figure ground, phenomenal identity, and so forth) 
which is, of course, the problem to begin with. 


Information for self and the world. A central feature intrinsic to Gibson’s 
sense of information that has already been discussed somewhat is that the same 
transforming optic array not only can specify change and nonchange generally, 
but also specifies the movements and postures of the observer and the arrange- 
ment and rearrangements of the environment simultaneously. Gibson asserts that 
all information has two poles of specification, a “subjective” aspect and an 
“objective” aspect, as illustrated in Gibson et al. (1955) in particular and more 
generally in all cases of locomotion through a basically stable environment 
(Gibson, 1958; Lee, 1974). In such cases he has often suggested that the variants 
in the array specify subjective movement and the invariants specify the persisting 
environment, although this is a simplification meant more to stimulate thinking 
than to provide closure. Gibson calls the structure specific to environmental 
events exterospecific (as opposed to “exteroceptive” which is associated with 
specialized receptors) and the structure specific to the organism or point of 
observation propriospecific. These are aspects into which information can be 
decomposed regardless of its manifest form (although the focus of this discus- 
sion has primarily been on information available in optical form). Gibson 
maintains that information carried in any manner can contain both propriospe- 
cific and exterospecific information. Therefore, he sees no need for classifying 
sensory organs into those specializing in detecting states of the self and those 
specializing in detecting states of the world. To dramatize the argument against 
specialized receptors Gibson discusses “visual kinesthesis” as an example of 
obtaining propriospecific information from light (1958) and a variety of ex- 
amples of haptic shape perception to illustrate obtaining exterospecific informa- 
tion from the skin-joint system (e.g., Gibson, 1962). 


Generalizing the array concept. Throughout this chapter I am following 
Gibson’s practice of concentrating on examples from light and vision. Neverthe- 
less the principles of Gibson’s approach require that the full story of perception 
be told in more than light and sight. The terrestrial world filled with a variety of 
events contains many different embodiments of patterned energy. What each of 
these can specify must be investigated just as carefully as light. Thus one could 
analyze the specifying potential of vibrations in the air (ecological acoustics), of 
the gases dissolved in air (ecological olfaction and gustation), or of patterns of 
deformation of the skin and articular stresses (ecological haptics). One should 
recognize that these are each quite different embodiments of event structures. 
Various combinations of them co-occur. Each embodiment has its own strengths 
and weaknesses in terms of the events it can specify. For instance, chemicals in 
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the air do not seem capable of maintaining the kind of persistent detail that 
teverberating volumes of light patterns do. Opaque surfaces have somewhat 
different consequences for structuring acoustic vibrations and light. The same 
world of events should be regarded as simultaneously structuring each of the 
possible embodiments of information in a specific fashion, but the different 
capacities of the patterning media themselves constrain the aspects of the world 
that can be specified in them. 

Facts such as these should in turn have consequences for the possible evolu- 
tionary design of perceptual systems by nature and their analysis by researchers. 
That is, no one perceptual system (that is, sense modality) should be regarded as 
a privileged purveyor of truth in the Bishop Berkeley tradition. Each should be 
considered as having access to information about the same set of events simul- 
taneously. But each should also be recognized as potentially having special 
capabilities, strictly as a consequence of what can be specified in light, sound, 
smell, and so forth. Possessing multiple channels of sensitivity allows observers 
to differentiate more available information than they could otherwise. In cases 
where two or more perceptual systems happen to overlap in the information 
they are obtaining, one should regard this as redundant information and not as a 
case of enrichment or integration (where the integration implies combining units 
that were not previously integrated). Thus a cube which is seen and felt at the 
same time and is identifiable as a cube through either system, is a cube specified 
in two ways at the same time on Gibson’s view. The invariant information 
specifying the cube is presumed to be sufficiently abstract so as to be identical 
for both touch and vision. Some properties, however, may be modality specific. 
For instance, the temperature of the cube is more likely to be available to touch 
than to vision, whereas the color of the cube is more likely to be available to 
vision than to touch. 

Thus there is no claim that there are not important distinctions. Yet to the 
extent that the information of interest is that specifying geometrical shape 
Gibson would assert that it is the same across modalities. In this way the 
observer, whether animal or human, was designed to conform with the identities 
and differences that exist in the world of events rather than as an assemblage of 
message channels which have no meaning apart from the way they are combined 
(say, as implied in Bishop Berkeley’s idea that vision derives its meaning from 


touch). 


The environment. The actual investigations which fall into the area of per- 
ceptual research that Gibson designates as ecological optics have one basic goal 
in common; they have all sought to characterize the geometric structure under- 
lying the most important properties of the world, such properties as rigidity, 
nonrigidity, occluding edges, shape and size. It is important to realize that 
finding variables of an array which are specific to environmental structures 
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depends as much on selecting the appropriate environmental structures as it does 
on finding the correct “higher order” geometric variables. Gibson’s stress on the 
nature of surfaces played this role in his earlier work. Thus an equally important 
problem for Gibson’s general approach is the attempt to formulate a description 
of the environment that is compatible with theories of every array (optic array, 
acoustic array, haptic array, etc.) which can contain information about the 
environment. 

From 1966 on, Gibson has begun his account of perception with a description 
of environments, where by environment he means a description of the physical 
world which is relevant to the time and space scale of organisms. His careful 
environmental description is intended to capture the qualities of the physical 
world which have made the origin and maintenance of life possible. It is also 
meant to show that the composition and arrangement of substances, media, and 
the surfaces of their interfaces structure light (and acoustic vibrations, etc.) in 
specific ways. Thus a substance that is rigid should interact with light in ways 
that are different from substances that are nonrigid, and these differences may 
be specific. If an environment specifically structures an array, it is clear that an 
organism capable of processing this array’s structure would be in contact with 
that environment. 


Ecological optics: The current status of research. It is very important to have 
the overall coherent view such as Gibson is developing, but no matter how 
plausible or promising it appears, there is no substitute for detailed investiga- 
tions. Much of the research conducted by Gibson and his followers has been 
devoted to working out detailed examples of information. For the most part 
these emphasize geometric analyses of ecologically significant situations in- 
tended to discover possible environmentally structured array specificities. They 
have not emphasized research on characterization of the environment itself. In 
this respect this aspect of the overall enterprise which Gibson calls ecological 
optics is a continuation of the program begun as perceptual psychophysics and 
pursued primarily in the search for “higher order variables of stimulation.” The 
overall approach to perception in terms of ecology that Gibson takes is probably 
best thought of as a metatheory rather than a theory in the sense of offering 
specific, falsifiable hypotheses. The overall approach and its subcomponents 
such as ecological optics should be evaluated in terms of their fruitfulness. No 
one will ever be able to claim truth or falsity for it. However, the work 
conducted within a subspecialty such as ecological optics is full of testable and 
tested hypotheses. Some of the key steps have been taken by the following 
investigators: Gibson et al. (1955) whose work has already been discussed; Purdy 
(1958), who worked out an analysis of the correspondence between gradients of 
texture and surface slant; Hay (1966), who showed that the changing pattern or 
the shadow of an arbitrarily moving rigid plane surface was specific to the shape 
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and slant of that surface; Farber (1972), who extended Hay’s work by investi- 
gating a special case where the magnified projection of a rigidly rotating plane 
specified a nonrigid motion; Sedgwick (1973), who showed the power of the 
terrestrial horizon to specify the size of objects seen against it; Kaplan (1969), 
who investigated the role of the progressive appearance and disappearance of 
texture in specifying the occlusion of one opaque surface by another; Mace and 
Shaw (1974), who investigated the role of translatory symmetry in specifying 
the perception of one surface through another (transparent “depth”); Lee 
(1974), who provides a mathematical description of the optical flow pattern 
afforded a moving observer showing the existence of both exterospecific and 
propriospecific information; John Pittenger who, with Robert Shaw (this vol- 
ume), showed that the perceived age of faces can be explained in terms of a 
“remodeling” transformation belonging to a special geometry for nonrigid 
shapes. Only when specific proposals such as those in these studies can be made 
is it possible to construct and control stimulus conditions for perceptual experi- 
ments to see if particular organisms actually use the available information or not. 
The studies conducted with the information defined in the above work have 
generally shown excellent results, but none have established open-and-shut cases. 
They should be seen as decisive groundbreaking operations opening the way for 
a great deal of constructive work on many fronts in the future. 


FIVE WAYS TO HAVE A THEORY OF INDIRECT PERCEPTION 


Throughout most of his career a tenet of Gibson’s has been that perception must 
be direct rather than mediated by memory, inference or any other psychological 
process. It has seemed apparent to him that only a theory of direct perception 
can do justice to the facts of evolution and adaptive behavior. Yet the construc- 
tion of such a theory cannot proceed by fiat. As I have tried to show, a coherent 
theory of direct perception must have a broader scope than merely trying to 
model the mechanisms of perception. It must recognize that a theory of 
environments and a theory of the patterns of energy created by environments 
are just as much a part of a complete theory of perception as are theories of 
what organisms do. Once it is explicitly recognized that these are all mutually 
dependent components of perceptual theory, then they can be developed in 
compatible ways. Only as they are developed in compatible ways can direct 
perception make sense. I suspect that much of the consistent criticism Gibson 
has received for holding a direct theory of perception (e.g., Gregory, 1972; Gyr, 
1972; Johansson, 1970) stems from not recognizing how thoroughly comprehen- 
sive such a theory must be. To further clarify the multiple foundations of 
Gibson’s direct perception, let us examine what theorists seem to have meant by 
indirect perception. Contrary to the dichotomy implied in the direct-indirect 
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contrast, there seem to be at least five common grounds used to support the 
claim that perception is indirect. Holding any one of them would be sufficient to 
make a person an indirect theorist. Consequently, to hold a direct position such 
as Gibson’s requires an alternative to each one. 


What Structures Can Count as Stimuli? 


Any theory of perception presupposes some set of structures that are detected 
by the perceptual processes rather than constructed. Early theorists claimed that 
perception was based on point sensations in a frozen moment of time. Currently 
more complex entities such as oriented lines are being proposed as directly 
detectable structures. Since perception results in the experience of events that 
are spatially unified and exist continuously over time, a theory whose basic 
structures of stimulation do not have these qualities must interpose compound- 
ing or constructive mechanisms to build unified percepts. This integration of 
substructures is an intervening process thought to be different from the end 
result of perceiving. Such an approach represents one way that a theory of 
perception can be indirect. Helmholtz (1925) was quite explicit on this point: 
“A direct image of a portion of space of three dimensions is not afforded either 
by the eye or by the hand. It is only by comparing the images of the two eyes, 
or by moving the body with respect to the hand, that the idea of solid bodies is 
obtained [p. 23; italics added] .” 

Gibson, on the other hand, holds that such a “piecing-together” description of 
stimulus processing is not necessary because structure in the stimulation itself 
consists of spatial and temporal relationships. For him, structure is patterned 
discontinuity in an array. Thus a pattern of regularly appearing or disappearing 
texture relative to some point in the environment can be thought of as an 
instance of structure in the optic array defined over time. Such regularities or 
invariants which are defined only over change are common in mathematics, 
being at least as old as the derivative, and should not be ignored in descriptions 
of stimuli available for perception. It is important to realize that Gibson’s 
definition of structure is sufficiently abstract to apply to media other than light, 
indeed, any medium that can preserve any pattern of discontinuities over time. 
Since order and change of order are not modality-specific characteristics, Gib- 
son’s definition makes it possible to say that the same stimulus structure is 
equally available to several senses. For example, the adjacent order of the 
vertices of a cube are an obvious aspect of its physical structure. This physical 
adjacent order will produce corresponding adjacent orders both in the pressure 
patterns to a hand that grasps the cube and in the light patterns to an eye that 
looks at it. 

From the viewpoint of Gibson’s dynamic approach to stimulus analysis, the 
observation that organisms perceive events structured over time no longer 
provides sufficient reason to conclude that perception is an indirect, constructive 
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process which adds a dimension of temporal integration not to be found in raw 
impinging stimuli. Since dynamic structures may count as stimuli, perceptual 
processes using them may be referred to as direct detection processes, which 
occur over time because the stimuli being processed are defined over time. Thus, 
Gibson has overcome one obstacle to a direct theory of perception, or the 
notion that givens or data are momentary slices in time, by changing what can 
count as structural givens in stimulation to include regularities of change. 

A consideration of current single cell “detector” mechanisms may help to 
dispel the notion that Gibson’s ideas about direct response to complex relation- 
ships are abstract speculations that cannot be physically realized.’ The per- 
ceptual apparatus of amphibians and mammals is somehow constructed so that 
certain central cells respond to relationships that are defined in terms of space 
(line detectors) and in terms of space-time (motion detectors). Their existence is 
thoroughly documented. Recently, Gross, Rocha-Miranda, and Bender (1972) 
have reported “monkey paw” detectors in the inferotemporal cortex of Macaca 
mulatta. None of the investigators in these areas seems to have felt a need to 
explain the action of such cells by referring to comparisons of current input with 
stored images or integrating momentary images over time with constructive 
operations. Detector cells should be considered to be examples of physical 
systems which are organized (“tuned” in Gibson’s terminology) to give specific 
responses to specific relations in light. These detectors do not, of course, prove 
the existence of mechanisms that respond directly to the more complex relations 
that Gibson discusses. However, the single cell work does provide an analogue to 
Gibson’s notion of direct perception of structured events, which in turn makes 
the idea plausible and shows that it cannot be dismissed on a priori grounds. 


Does Information Exist? 


As already discussed, stimulation is said to contain information if its structure is 
specific to the environmental sources of that structure. Only if stimulation 
contains such specific structure could it specify its sources, and only if it 
specified its sources could the detection of such structure be said to be direct 
perception of these sources. The possibility of direct contact with an environ- 
ment thus depends heavily on the existence of information in Gibson’s sense. 
Indeed, ecological optics, the theoretical and empirical investigation of the 
information available in light has occupied most of the time of Gibson and his 
students. If information in Gibson’s sense did not exist, then intervening 
steps in perceptual processing such as hypothesis constructing and testing are 
necessary. Assuming the idea that the structure of stimulation does not specifically 


! The observations in this paragraph are primarily those of John B. Pittenger, Department 
of Psychology, University of Arkansas at Little Rock. 
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correspond to any of its sources, then, requires that perception be indirect. Gibson 
has avoided this lure into mediation by developing his concept of information. 


The Senses Considered as Perceptual Systems 

Traditionally the “senses” have been regarded as separate input channels funnel- 
ing messages to a central processor which must compare the inputs coming in 
from each in order to make unitary decisions about the current state of the 
world. To the extent that perception is a product of putting together these 
messages from separate channels, perception must be regarded as indirect or 
constructive. Richard Gregory (1969) has suggested that perception of visual 
patterns is indirect because “what matters is whether the object is useful, a 
threat, or food. It is non-optical properties that are important [p. 245].”” And 
later he asserts: “To build a seeing machine, we must provide more than an ‘eye’ 
and a computer. It must have limbs, or the equivalent, to discover non-optical 
properties of objects for its eyes’ images to take on significance in terms of 
objects and not merely patterns [p. 246].” In other words, our skin and body 
have certain privileges not enjoyed by the visual system—namely that patterns of 
deformation of the skin created by objects and patterns of muscle movements 
created, say, by walking toward an object are directly meaningful. 

This view goes back at least to Bishop Berkeley and is apparently forced by the 
assumption of meaningless punctate stimulation for vision. Interestingly enough 
it is not supported by a shred of empirical evidence or coherent theory. For 
instance, if one arranges conflicts of information available to various modes of 
attention, the conflicts are always resolved in favor of vision (e.g., Pick, 1970). 
Or imagine some of Gibson’s favorite key cases, a fish swimming upstream or a 
bird flying against the wind. How could nonvisual criteria such as effort ex- 
pended or muscles recruited have any effectiveness for guidance in these cases? 
Could it really be very convincing to try to account for these kinds of phe- 
nomena by saying that somehow organisms learned to trust vision gradually after 
first not trusting it? How could this work? It is really quite difficult to see why 
the pupil would always outstrip its teacher in perceptual conflict situations. 
What would guide such a shift of control? 

For Gibson, the difference between meaningful and meaningless stimulation 
has nothing to do with whether or not stimulation across modalities is inte- 
grated. As described above, there is information differentially available to the 
various sense modalities. Hearing is better than seeing when one is interested in 
events taking place behind a nearby opaque surface; though seeing is certainly 
better than hearing in determining the opacity or shape of a surface. Touch is 
clearly better at perceiving temperature differences. And so it goes. Rather than 
having these be isolated systems to correlate, however, Gibson would have each 
of these modes of attention (each of which has the motor resources of the body 
available to it) sampling the same structured world. When they are sampling the 
same events, there would rarely be contradictory information—though the in- 
formation could be richer as suggested in the examples immediately above. 
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In summary, for Gibson, the senses operating as perceptual systems are all 
sensitive to information about the entire environment. Each mode of attention 
has its own special capacity for detecting information but these specialties 
reflect the nature of the embodiment of world structure as well as the nature of 
the detecting system. 


Affordances 


Gibson’s concept of affordances, that is, information specifying the adaptive 
value of objects or events for organisms, is an important result of con- 
sidering the senses to operate as perceptual systems rather than uncoordi- 
nated sensory channels. (See Gibson, this volume.) As indicated above, Gregory 
and others often seem to give special status to nonoptical stimuli. From Gibson’s 
viewpoint these arguments are seen to rest upon a very narrow set of examples 
that ignore the important role of visual perception in survival and adaptation. 
Although it is true that no organism has ever been killed or maimed by a purely 
optical event, say by just seeing a club swung, successful avoidance of harm may 
require seeing it swung in time to dodge. Try hiding in a glass phone booth to 
avoid being seen by a mugger or using a cellophane fig leaf to avoid public 
embarrassment. Consider the indispensable role of optical information for trans- 
parency (e.g., in seeking water), opacity (e.g., in avoiding bumping into objects), 
coloration (e.g., in selecting ripened fruit) and patterns (e.g., in using camouflage 
to avoid predators) before accepting Gregory’s or Bishop Berkeley’s exclusive 
appeal to nonoptical properties of stimulation in order to give meaning to visual 
information. Indeed the felt effects of being caressed, kissed, clubbed or burned 
derive as much meaning from seeing the initiating source as the proper interpre- 
tation of the source depends upon its felt effects. Full meaning of such events 
arises from the systematic coordination of all sense modalities. This coordination 
is made possible by the fact that the spectra of energy forms appropriate to each 
sense modality have a source (the object or event) which possesses a unitary 
structure as von Hornbostel (1927) so well knew. 

Gibson’s notion of affordance not only allows one to describe the environment 
from a point of observation, but does so with respect to a particular observer, 
taking into account the observer's size, form, and capabilities. Affordances 
constitute a partitioning of an environment with an organism in mind instead of, 
say, the more neutral partitioning of the environment of energy flux into 
observer-independent properties by classical physics. 

Some typical examples of affordance descriptions of environmental properties 
are walk-on-ability, grasp-ability, injury, collision and nutrition. One says that 
environmental properties afford the above activities; for example, a coffee cup at 
room temperature affords grasping by humans. Although defined relative to an 
organism, affordance relations exist independent of conscious experience or any 
subjective states of an organism. A persistent surface which is strong enough to 
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hold the weight of an animal can be said to afford support for it whether the 
animal is in a state of realizing it or not. 

Without this concept one could say that even if Gibson’s position on the 
previous three points were accepted, perception of the world could nevertheless 
be indirect because an organism detecting Gibson’s higher order invariants would 
still have to make a connection between those invariants and properties of the 
world which were useful to him. Not only would there be this step, but there 
might also be a step from a property to what the organism could do with it. For 
example the animal might have to connect the invariant structure of a perceived 
pattern with a property such as “hardness” or “opacity” and then from these 
infer what activities it can perform. The concept of affordance makes the last 
step unnecessary. It says that “hard” and “opaque” are no longer the descrip- 
tions of any properties whose invariants we're seeking. Rather we are seeking 
optical structure (where vision is concerned) that corresponds to what can be 
seen through, hidden behind, hammered with, and the like. An affordance 
partitioning of the world would be very different from one based on “proper- 
ties” defined with respect to events that do not involve organisms directly. 

A full appreciation of what is entailed in an affordance analysis also removes 
the need for talking about the other “step” in perception, that from computing 
the input structure to finding the environmental property to which it cor- 
responds—even where the environmental property is expressed in affordance 
terms. This is the step in Gibson’s direct perception theory which is hardest to 
comprehend and even harder to accept. Johansson (1970) expresses typical 
reservations. He argues that there must be some kind of code allowing the 
organism to infer the nature of the world around him from the structure in the 
light which he detects. A more internal threat to the coherence of the theory of 
direct perception which I will deal with sneaks in through Gibson’s typical 
examples of affordances. Take “graspability.” To say that an organism sees the 
graspability of an object could imply that it had to make a cross-modal 
correlation à la Gregory and lead right back to a version of constructive theory. 

Both of these possible Trojan horses can be dealt with through a more careful 
consideration of what a thorough affordance description of an environment 
would look like. Instead of “senses” Gibson speaks of modes of attention. Let us 
examine vision as a mode of attention relative to which affordances can be 
described. By sticking to one mode of attention I hope to avoid raising the issue 
of cross-modal correlation for at least a little while, since it is something of a red 
herring. 

Imagine an organism in an open field looking to the horizon. The optic array 
consists of a light upper portion and a darker lower portion, the sky above and 
the earth below. The gradient of the earth’s texture can be thought of as 
specifying what optical transformations are afforded in this particular optic 
array for this particular organism with its size, shape, and exploratory capacities. 
In such a case, all optic array transformations obtainable by displacement would 
be specified. Suppose there was an indistinct object near the major light-dark 
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transition in the array (the horizon). How might the visual system magnify part 
of its optic array so as to clarify the indistinctness and to identify the relative 
permanence in the overall structure? By using its legs, of course! 

This type of description can be used to illuminate what Gibson means by 
perceptual systems as well as affordance structure. For him, vision is a system 
capable of actively exploring the environment precisely because it is a subsystem 
of a more complex system, the human body; as a subsystem evolution has 
attuned vision to work in a well integrated fashion with other subsystems (for 
example, the motor system). In this sense it is the whole organism which “sees” 
and thus which can use its legs to carry out visual investigations of the 
environment. The whole body can be mobilized in the interests of visual 
exploration and, for Gibson, counts just as much as a part of the visual system 
(when being used for visual exploratory purposes) as the eyeball itself. From this 
point of view, it is to be expected that so-called motor sections of the nervous 
system should be involved in perceiving. It should also be expected that careful 
consideration of the motor nervous system would ultimately equivocate on the 
very existence of a firm distinction between the motor and sensory aspects of 
the nervous system when considered with respect to perceiving. (See Turvey, 
Chapter 9, this volume.) 

When Gregory said it was nonoptical properties of the world that really 
mattered, he did not consider whether or not there might be specific optical 
correlates for the events he had in mind. Expanding stimulus patterns which 
specify looming objects, for example, could again be considered from the optical 
viewpoint as well as from the point of view of mechanical contact with the skin. 
Thus one could say with justification equal to Gregory’s that a symmetrically 
expanding pattern which fills the entire visual field is one to be avoided. Surely 
it makes at least as much sense to say that the optical event which specifies 
collision should be avoided as to say that the tactile events corresponding to 
collision are what the organism is avoiding (in the cases where it actually does 
avoid the potentially injurious event). Indeed, there is evidence that animals will 
attempt to avoid optical looming, as if experiencing the imminence of an 
impending collision, although no tactile stimulation is possible (Schiff, Caviness, 
& Gibson, 1962). Information for occluding surfaces, supporting surfaces and 
open vistas is information for possible paths open to the visual system for 
clarifying the optical structure of the environment. Thus, a very rich partitioning 
of the whole environment can be referred to an affordance analysis of vision 


alone (or any other mode of attention). 


Second-Hand Perceiving 


A fifth sense of indirect perception is a literal one which Gibson himself 
has often discussed—seeing at second hand, as in pictures. When one looks 
at a photograph or a representational painting or a movie, one is 
looking at a segment of someone else’s optic array. Though sharing some 
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similarities with a natural optic array, these displays are not the same as a full 
optic array, and, therefore, cannot be explored in the same ways. They do not 
contain all of the same information and, consequently, should be analyzed 
separately. This is not to say Gibson is not interested in such forms of indirect 
perception. In fact he has written often about the problem of perceiving pictures 
(e.g., Gibson, 1971) and certainly has no prejudice against the study of indirect 
perception when it is construed as the perception of someone else’s view. 
Obviously interest in this problem in no way compromises his belief in the 
notion of direct perception. 

On the other hand, the belief that the retinal image is the stimulus for vision 
would indeed make perception indirect. If what is really perceived were the 
retinal image projected from the world rather than aspects of the world itself, 
then perception would be second hand, and hence indirect; moreover, under this 
view seeing a picture would be an account of how one normally sees the world 
instead of being just an account of a very special case of seeing, as Gibson would 
have it. Gibson’s objections to the idea that we see the world by means of retinal 
images are numerous. Here are five: 


1. An observer for the image is implied and this observer’s perception must be 
explained. Such homunculus explanations lead to infinite regresses. 

2. No theory is provided to explain what aspects in the organization of the 
image are sufficient to account for visual perception. All an image can account 
for is the fact that some organization from the world is faithfully mapped on to 
physiological structures. Since this organization, whatever it might be, must be 
further mapped throughout the nervous system, all that is asserted is a causal 
chain. There seems to be no compelling reason to claim that the “ultimate per- 
ceiver” of the nervous system is in any more direct contact with the retinal 
image than with the patterns in the environment. Thus, the so-called “mind- 
body” problem raises its ugly head. 

3. One gets into hopeless muddles with respect to problems of orientation. 
The image is often referred to as upside down or left-right reversed. But such 
judgments must be made with respect to some frame of reference. The ultimate 
frame of reference of earth and sky cannot change in normal perceiving. They 
cannot be upside down. Yet when considered as a picture in the head which is 
upright in the world the image is upside down. Many theorists have taken this to 
indicate that the world should be perceived as upside down. That is part of the 
retinal image-as-stimulus assumption. However it assumes that the observer is 
well-oriented in order to notice that the image is inverted. Although Gibson does 
not claim to have a fully articulated theory of what happens in optical inversion 
experiments, he is nevertheless quite sure that reference to the retinal image is 
not helpful (compare Gibson, 1964, for a discussion of Kohler’s prism studies). 

4, There are organisms such as the horseshoe crab which functionally per- 
ceive and appear to be sensitive to much the same information that is detected 
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by other organisms (Schiff et al., 1962), but they have no retinal images because 
they have no retinas. 

5. One never supposes that there are auditory or haptic images which are 
the real objects of perception. One does not say that the skin is in indirect con- 
tact with substances it touches. Gibson sees no reason why variant and invariant 
information and their detection should not be thought of as examples of the 
same types of decomposition process in each mode of attention. (An excellent 
philosophical argument against claims that images are the direct objects of 
perception may be found in Austin, 1962.) 


Above are five different ways that an indirect theory of perception might be 
held. There are surely more. However, these points illustrate that Gibson’s view 
that perception is direct apprehension of many important aspects of the environ- 
ment is a reasonable alternative to the indirect view which is fraught with logical 
weaknesses. On the other hand a proponent of any of the indirect positions 
bears the considerably heavier burden of showing how such a view is compatible 
with both evolution and adaptive behavior. 


CONCLUSIONS 


In my opinion, the fruitfulness of Gibson’s research project is well established. 
Insofar as it has been based upon a direct theory of perception this provides a 
strong argument that such a theory is both logically consistent and empirically 
sound. However, though thoroughly sketched now, the approach is far from 
complete. It should be recognized that what Gibson said of his psychophysical 
theory in 1959 is also true of his current ecological theory: 


The theory offered is immature in the sense that the program of investigations called for 
has only begun. It is also immature in that its potential scope seems to be wider than the 
scope of the problems to which it has been applied. 

...The theory has been extraordinarily fruitful in suggesting to the author hypotheses 
for experiments and in opening up new ways of experimenting on old problems. The 
important question is whether it will serve the same function for others [Gibson, 1959, 


p. 499]. 


A full appreciation of Gibson’s theory requires a careful review of over a 
quarter of a century of work. Nothing less suffices to give a clear picture of the 
breadth and depth of his thinking about the most difficult problems in the 
theory of knowledge that psychology must face. His work is truly a significant 
exercise in experimental epistemology. There can be no doubt, however, that 
many of his views are quite radical, strikingly so if we look at Koffka’s statement 
of the problems of perception. He asked “Why do things look as they do?” After 
rejecting as utterly ridiculous the answer “because they are what they are,” and 
as reasonable but wrong “because the proximal stimuli are what they are,” 
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Koffka settled on an answer appealing to organized brain processes set up by 
proximal stimuli. In his psychophysical program, Gibson showed that by recon- 
ceiving the stimulus structure one could make a good case for the second of 
Koffka’s answers. One might say “things look as they do because the proximal 
stimuli are what they are—we just looked at the wrong proximal stimuli.” But 
now, with his ecological program, reconceptualizing the physical environment as 
well as the stimulus, Gibson is suggesting that what was utterly ridiculous might 
be true. We may be able to change Koffka’s question to the stronger “why do 
things look as they are?” and seriously answer “because they are what they are.” 
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The Theory of Affordances' 


James J. Gibson 


Cornell University 


A description of what the environment affords the animal can be given in terms 
of a list beginning with simple and ending with complex things. Such a list 
includes features of the terrain, shelters, water, fire, objects, tools, other animals, 
and human displays. In addition, the information that is available in ambient 
light for the perception of substances, their surfaces, and the layout of these 
surfaces must also be described. An attempt should also be made to connect the 
two, to show that the variables of substances and layout combine to make 
affordances for animals and to demonstrate that the optical information for 
perceiving the variables combines to yield information for perceiving the afford- 
ances. What is being attempted is an explanation of how the “values” or 
“meanings” of things in the environment could be directly perceived. 

What is meant by an affordance? A definition is in order, especially since the 
word is not to be found in any dictionary. Subject to revision, I suggest that the 
affordance of anything is a specific combination of the properties of its sub- 
stance and its surfaces taken with reference to an animal. The reference may be 
to an animal in general as distinguished from a plant or to a particular species of 
animal as distinguished from other species. Note that the properties of substance 
and surface are physical properties but that they are not described in classical 
physics, only in ecological physics. The combination of properties is uniquely 
related to the animal or species being considered. It is assumed that if the 
properties of substance and surface are given in light the combination is given, 
and hence that if the properties are perceivable the special set of properties will 
be perceivable. In fact we can entertain the hypothesis that the affordance may 
be more easily perceived by an animal than the properties in isolation, for the 


!'This is a preliminary version of a chapter from a forthcoming book entitled An 
Ecological Approach to Visual Perception to be published by Houghton-Mifflin Co. 
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invariant combination of properties is “meaningful” whereas any single property 
is not. 

The affordances of the environment are what it offers animals, what it 
provides or furnishes, for good or ill. Let us consider two examples, the first 
being an affordance for terrestrial animals in general and the second being an 
affordance for man in particular. 

If a substance is fairly rigid instead of fluid; if its surface is nearly horizontal 
instead of slanted; if the latter is relatively flat instead of convex or concave; and 
if it is sufficiently extended, that is, large enough, then it affords support. More 
particularly it affords support to large animals who would sink into a surface of 
water, or in a swamp. It is a surface of support, and we call it a substratum, 
ground, or floor. It is stand-on-able, permitting an upright posture for quad- 
tupeds and even for bipeds. Thus it may also be walk-on-able. If there is optical 
information for the four properties listed, rigidity, levelness, flatness, and ex- 
tendedness then the affordance can be perceived if the information is detected. 

The next example is more particular. If an object that rests on the ground has 
a surface that is itself sufficiently rigid, level, flat, and extended, and if this 
surface is raised approximately at the height of the knees of the human biped, 
then it affords sitting-on. We call the object a seat, stool, bench, or chair. It 
affords support for the rump, whether or not it affords support for the back. If 
these five properties coexist the object is in fact sit-on-able; they combine to 
yield a higher-order property for the human observer. The object may then be 
perceived as sit-on-able without much attention being paid to the five properties 
in isolation. Note that knee-high for a child is not the same as knee-high for an 
adult so that sit-on-ability must be taken with reference to a subclass of the 
human species. The surface layout may be a natural seat like a log or a ledge or 
an artificial seat like a chair or a couch; the affordance is the same. Note that 
some properties like the color and texture of the surface are irrelevant to the 
fact of being a seat, and that other properties only determine what kind or 
subclass of seat it is, stool, bench, chair, etc. 

Now just as surfaces are stand-on-able and sit-on-able so also are they bump- 
into-able or get-underneath-able, or climb-on-able, or fall-off-able. Different 
layouts afford different kinds of behavior and different sorts of encounters, 
some beneficial and some harmful. I tried to classify these offerings and 
opportunities of the layout but the classification should now be enlarged upon. 

Moreover the objects of the environment afford activities like manipulation 
and tool using. The substances of the environment, some of them, afford eating 
and drinking. The events of the environment afford being frozen, as in a 
blizzard, or burned, as in a forest fire. The other animals of the environment 
afford, above all, a rich and complex set of interactions, sexual, predatory, 
nurturing, fighting, play, cooperating, and communicating. What other persons 
afford, for man, comprise the whole realm of social significance. We pay the 
closest attention to the optical information that specifies what the other person 
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is, what he invites, what he threatens, and what he does. For each of these kinds 
of affordance the question we must ask is, how is it perceived? First, what is the 
stimulus information to specify it and, second, how is the information picked 
up? 


THE NICHES OF THE ENVIRONMENT 
AND THE REALITY OF AFFORDANCES 


Environmental scientists, ecologists, make use of the concept of a niche. A given 
species of animal is said to utilize a certain niche in the environment. It is not 
the same as the habitat of the species, that is, where it lives, but rather how it 
lives. I suggest that a niche is a set of affordances. The natural environment 
offers many ways of life and a way of life is a set of affordances that are utilized. 

The reciprocity of animal and environment is implied by this theory for the 
niche implies a certain kind of animal and the species implies a special niche. But 
the independent existence of an unlimited environment is also implied, for the 
niches must be available before animals can begin to exploit them. The affording 
of life by the environment is presumably of unlimited richness and complexity. 
The physical, chemical, meteorological, geological and geographical conditions 
of the surface of the earth, and the preexistence of plant life, are what make 
possible animal life. They have to be invariant or persisting for animals to evolve. 

The environment affords many different kinds of food and many different 
ways of getting food. It affords various sorts of preexisting shelters or places to 
hide, in holes, crevices, and caves, and various materials for the making of 
shelters such as mounds, nests, and huts. It affords various kinds of posture like 
floating, clinging, resting, and standing, and various kinds of locomotion like 
swimming, crawling, walking, climbing, and flying. These offerings have all been 
taken advantage of, which is to say that the niches have been occupied. But, for 
all we know, there may be many offerings of the environment that have not 
been taken advantage of, that is, niches not yet occupied. 

Architecturally speaking, a niche is a place that is suitable for a piece of 
statuary, that is, a place into which the object fits. The metaphor is interesting. 
Ecologically speaking, a niche, although not literally a place, is a setting of 
environmental features that are suitable for the animal, and into which it fits 
metaphorically. 

The concept of the niche emphasizes an important fact about affordances, 
namely that they are real. Although an affordance consists of physical properties 
taken with reference to a certain animal it does not depend on that animal. In 
this respect an affordance is not like a value which is usually supposed to depend 
on the observer nor is it like a meaning which is almost always supposed to 
depend on the observer. An affordance is not what we call a “subjective” quality 
of a thing. But neither is it what we call an “objective” property of a thing if by 
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that we mean that a physical object has no reference to any animal. An 
affordance cuts across the dichotomy of subjective-objective and helps us to 
understand its inadequacy. The affordances of the environment are facts of the 
environment, not appearances. But they are not, on the other hand, facts at the 
level of physics concerned only with matter and energy with animals left out. 

The niche for a certain species should not be confused with what some animal 
psychologists have called the phenomenal environment of the species. This can 
be taken erroneously to be the “private world” in which it is supposed to live, 
the “subjective world,” or the world of “consciousness.” I will argue that the 
behavior of an observer depends on his perception of the environment, surely 
enough, but that this does not at all mean that his behavior depends on a 
so-called private, or subjective, or conscious environment. The organism depends 
on its environment for its life but the environment does not depend on the 
organism for its existence. 


MAN'S ALTERATION OF THE NATURAL ENVIRONMENT 


In the last few thousand years, as everybody now realizes, the very face of the 
earth has been modified by man. This means that the layout of surfaces has been 
changed, by cutting, clearing, leveling, paving, and building. There are still 
natural deserts and mountains, swamps and rivers, forests and plains, but they 
are being encroached upon and reshaped by man-made layouts. Moreover the 
substances of the environment have been partly converted from the natural 
materials of the earth into various kinds of artificial materials like bronze, iron, 
concrete, and bread. Even the medium of the environment—the air for us and 
the water for fish—is becoming slowly altered despite the restorative cycles that 
yielded a steady state for millions of years prior to man. 

Why has man changed the shapes and substances of his environment? So as to 
change what it affords him. He has made more available what benefits him and 
less pressing what injures him. In making life easier for himself, of course, he has 
made life harder for most of the other animals. Over the millenia he has made it 
easier for himself to get food, easier to keep warm, easier to see at night, easier 
to get about, and easier to train his offspring. 

This is not a new environment, an artificial environment, distinct from the 
natural environment, but the same old environment modified by man. It is a 
mistake to separate the natural from the artificial as if there were two environ- 
ments. Artifacts have to be manufactured from natural substances. It is also a 
mistake to separate the cultural environment from the natural environment, as if 
there were a world of mental products distinct from the world of material 
products. There is only one world, however diverse, and all animals live in it, 
although we human animals have altered it to suit ourselves. 
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The fundamentals of the environment, the substances, the medium, and the 
surfaces are the same for all animals. No matter how powerful men become we 
are not going to alter the fact of earth, air, and water, the lithosphere, the 
atmosphere, and the hydrosphere, together with the interfaces that separate 
them. For terrestrial animals like us the earth and the sky are a basic structure 
on which all lesser structures depend. We cannot change it. We all fit into the 
substructures of the environment in our various ways for we were all, in fact, 
formed by them. We were created by the world we live in. 


FURTHER EXAMPLES OF AFFORDANCES 


The theory is that, although the environment consists of substances, surfaces, 
and the medium at one level, it consists at another level of affordances for 
animals. The substantial properties and the shape properties combine to make 
properties of higher order. The latter are not as easily analyzed by chemistry and 
geometry as substance and shape are but they are just as real. Let us consider 
some additional examples. 


What Do Substances Afford? 


Recall the formula that air, water, and earth or, more generally, the gaseous, 
liquid, and solid state, are increasingly substantial. Then note that air, although 
insubstantial, affords breathing because of its oxygen. It also affords unimpeded 
locomotion. When the air is illuminated it affords visual perception, being 
transparent. Water, more substantial than air, affords drinking. But for us it does 
not afford breathing but drowning. Being a solvent it affords bathing and 
washing. Being fluid it affords pouring from one vessel to another. A surface of 
water will be considered later; only note now that it does not afford support for 
heavy animals—only for waterbugs. 

Solid substances, being still more substantial, afford all sorts of physiological 
and behavioral activities. Certain of them afford eating, more exactly ingestion, 
and of those that afford ingestion some afford nutrition as against others that do 
not. Some few in fact afford the opposite of nutrition, poisoning. (Note that I 
say nothing here about what affords pleasure in eating; that is another matter 
entirely.) Whether or not a vegetable substance affords nutrition depends upon 
the biochemical state we call ripeness, and this often is specified by the color of 
its surface. Solids also afford various kinds of manipulation or manufacture 
depending on the kind of solid state. Some, like flint, can be chipped; others, 
like clay, can be molded; still others recover their original shape after deforma- 
tion; and some resist deformation strongly unless smelted by fire. Manipulation 
and manufacture are forms of behavior that are mostly but not exclusively 
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characteristic of primates—not exclusively, since wasps and birds manufacture 
nests. 


What Do Surfaces and their Layouts Afford? 


Passing from the substances of the environment to their surfaces, consider what 
the “shapes” of surfaces afford, by which is meant the solid geometrical shapes, 
or what I have called their layout. I said that a solid, level, flat, extended surface 
affords support and constitutes a ground for a terrestrial animal. He can stand on 
it and maintain equilibrium, or come to rest on it and maintain a fixed posture 
with respect to gravity, gravity being a force perpendicular to the ground. He 
does not fall or slide as he would on a cliff or a steep hillside. Note that 
equilibrium and a stable posture are prerequisites to other forms of behavior 
such as locomotion and manipulation. The ground is literally a basis for behav- 
ior, and also a sort of basis for visual perception, as I maintained in what I once 
called the “ground theory of space perception” (Gibson, 1950). If this is true 
the physical-geometrical features of the ground and its affording of support to a 
terrestrial animal do not belong to separate realms of discourse; they are one and 
the same. Geometry, in the last analysis, is connected with life. 

If the ground is lie-on-able and stand-on-able it is also walk-on-able and 
tun-over-able. It affords locomotion. For an animal with feet, it affords what we 
call “footing” although this depends on the absence of foot-sized obstacles like 
loose rocks, and the absence of slipperiness caused by the presence of banana- 
peels or smooth ice. Hikers need to pay attention to the footing. 

The terrestrial earth, of course, is seldom solid, level, flat, and extended all the 
way out to the horizon. It is “cluttered.” Usually there are features of the 
terrain with which the flat earth is furnished. Deferring consideration of rela- 
tively small detached objects for the moment, let us list the terrain features that 
do not afford pedestrian locomotion but require other kinds. They seem to be 
surfaces of water or of watery earth, slopes upward of varying steepness to the 
maximum of a cliff wall, slopes downward of varying steepness to the maximum 
of a cliff brink, and finally simple obstacles. 

A surface of water like a stream or a pond affords only special sorts of 
locomotion, swimming or wading, for which the animal may or may not be 
equipped. The same is true of a swamp. A slope upward begins to require 
climbing when steep, and a wall may be unclimable although a small wall, a 
“step,” is negotiable. Similarly a slope downward begins to afford falling when 
steep, and the brink of the cliff is dangerous—a falling-off place. Men have 
altered the layout of such slopes by building Stairways so as to facilitate the 
behavior of ascending and descending. The steps of a stairway are of such size as 
afford stepping up or down, given the size of the legs of a man. 

In short for the ordinary environment, there are barriers to locomotion in 
some directions. If there are barriers to locomotion in all directions the observer 
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is “imprisoned” as in the case of a complete enclosure, or cell. The situation of 
the saint who lived on top of a high pillar was also that of a prison, be it noted, 
although he was surrounded by brinks instead of walls. But ordinarily there are 
openings, that is to say paths, between barriers and then the special kind of 
locomotion that we call roundabout is afforded. A special kind of barrier, 
smaller than a wall or fence, is a simple obstacle. Like a wall, an obstacle affords 
collision but, being of animal size, it can be avoided without roundabout 
locomotion. 

The progress of locomotion, we can now observe, is visually guided, and it 
depends on the avoidance of obstacles, barriers, brinks, and surfaces of deep 
water. The steering of locomotion, the control of it, depends on the progressive 
perceiving of these features of the environment, their negative affordances. 
There will be more about the control of locomotion later, but it is worth 
recalling now that optical information is available in ambient light for the 
perceiving of these features of the layout as well as for the perceiving of 
locomotion itself. The features I have listed above are relevant to pedestrian 
locomotion but a modified list could be drawn up for the locomotion of birds, 
and for fish. 

The imminence of collision with an obstacle or barrier is optically specified for 
any kind of locomotion: walking, flying, or swimming. There are at least some 
general laws that hold for perception in all animals. The information for 
imminence of collision is a high rate of symmetrical outflow of part of the 
ambient optic array, the approach to the maximum possible visual solid angle 
which specifies zero distance. This can be described as “looming,” (Schiff, 
1965). The larger the silhouette the closer to contact or collision. 


The Affording of Concealment 


This is the place to describe an interesting kind of social behavior that is 
afforded by a cluttered environment of opaque surfaces. I mean the act of 
hiding, both the hiding of an object from other observers and the hiding of 
oneself from other observers. Concealing or screening one’s body is something 
that many animals do, both the hunted and the hunter, both prey and predator, 
and even human children at play. 

One of the rules of ecological optics is that at any fixed point of observation 
some parts of the environment are projected or revealed and the remaining parts 
are unprojected or concealed. The reciprocal of this rule is that the observer 
himself, his body, is revealed at some points of observation and concealed at the 
remaining points. An observer thus perceives not only that other observers are 
unhidden or hidden from him but also that he is unhidden or hidden from other 
observers. The practicing of this kind of perception is what babies do in playing 
“peek-a-boo” and what children do when they play “hide and seek.” The act of 
hiding is to position one’s body at a point of observation that is concealed at the 
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point of observation of another or other observers—to go to a hiding place. | 
omit the optics of peepholes; the reader can work it out for himself. 

All this depends on the perception of occluding edges in the layout. The 
reciprocity of the observer and the environment is once more emphasized. The 
greatest degree of concealment is afforded by an enclosure (as defined earlier) 
and complete concealment is afforded by a complete enclosure. What we call 
“privacy” in the design of housing is the providing of opaque enclosures. Note 
that the screening of perception is not the same as the barring of locomotion; a 
screen and a barrier may be different. An opaque and rigid sheet does both, but 
an opaque and flexible sheet like a cloth curtain affords locomotion without 
perception whereas a transparent and rigid sheet like a glass window affords 
perception without locomotion. And a translucent sheet affords illumination but 
not perception, as I pointed out in formulating ecological optics. I omit the 
complexities of one-way screens for vision but they can be worked out from the 
principles that govern semitransparency. 

Besides hiding himself an observer can hide portable objects from other 
observers. These are usually objects of value, so called. Both animals and men 
perform this sort of social behavior. Food objects, utensils, and money can be 
buried in the earth or concealed in a chest or put away in a drawer. All of us, the 
higher animals, look for good hiding places, both for ourselves and for our 
treasures, 


What Do Detached Objects Afford? 


A movable object affords an astonishing variety of behaviors, especially if it is 
small relative to the size of the animal under consideration. If so, it is portable, 
that is, it affords lifting and carrying. For an animal with hands, a primate, the 
object may (or may not) afford grasping. To be graspable, an object must have 
opposite surfaces separated by less distance than the span of the hand. It must 
have an appropriate width, and the width can be perceived visually. 

Of course an attached or immovable object may also be grasped but then it is 
not portable. Instead it affords support, as a tree branch supports a monkey. The 
rung of a ladder and the hand-hold of a mountain climber on a cliff face are 
graspable in this special sense, 

In general graspable detached objects afford manipulation. There are so many 
kinds of manipulation and so many kinds of manipulated objects to accompany 
them that we can only sample the set. In “a new terminology for surface 
layout,” I described sheets, sticks, fibers, and containers in geometrical terms, 
and I mentioned tools and clothing, but that was a bare beginning. Here are a 
few examples: 


1. An elongated object of moderate size and weight affords wielding. If used 
to hit or strike it is a club or hammer. If used by a chimpanzee behind bars to 


3. THE THEORY OF AFFORDANCES 75 


pull in a banana beyond his reach it is a sort of rake. In either case it is an exten- 
sion of the arm. A rigid staff also affords leverage and in that use is a lever. A 
pointed elongated object affords piercing; if large it is a spear, if small a needle 
or awl. 

2. A rigid object with a sharp dihedral angle, an edge, affords cutting and 
scraping. It is a knife. It may be designed for both striking and cutting and then 
it is an axe. 

3. A graspable rigid object of moderate size and weight affords throwing. It 
may be a missile or only an object for play, a ball. The launching of missiles by 
supplementary tools other than the hands alone, the sling, the bow, the catapult, 
the gun, and so on is one of the behaviors that makes man a nasty dangerous 
species. 

4. An elongated elastic object like a fiber, thread, thong, or rope affords 
knotting, binding, lashing, knitting, and weaving. These are kinds of behavior 
where manipulation leads to manufacture. 

5. A hand-held tool of enormous importance is one that, when applied to a 
surface, leaves traces and thus affords trace making. It may be a stylus, brush, 
crayon, pen, or pencil but if it marks the surface it can be used to depict and 
to write, to represent scenes and to specify words. 


We have thousands of names for such objects and we classify them in many 
ways, tools like pliers and wrenches, utensils like pots and pans, weapons like 
swords and pistols. All of these objects have properties or qualities: color, 
texture, composition, size, shape, and features of shape, not to mention mass, 
elasticity, rigidity, and the like. Nevertheless I suggest that what we perceive 
when we look at them are their affordances, not their qualities. We can, of 
course, discriminate these dimensional qualities if required to compare them as 
objects. But the unique combination of qualities that specifies what the object 
affords us is what we normally pay attention to. 

If this is true for the adult, what about the young child? There is now a great 
deal of evidence to show that the infant does not begin by first discriminating 
the qualities of objects and then learning the combinations of qualities that 
specify the objects themselves. Phenomenal objects are not built up of qualities. 
It is quite the other way around. Objects, more exactly the affordances of 
objects, are what the infant begins by noticing. The meanings are observed 
before the substances and surfaces are. Affordances are invariant combinations 
of variables. And it is only reasonable to suppose that it is easier to perceive an 
invariant combination than it is to perceive all the variables separately. 


What Do Other Animals and Other People Afford? 


The richest and most elaborate affordances of the environment are provided by 
other animals and, for us, other people. These are, of course, detached objects 
with topologically closed surfaces but they change the shape of their surfaces 
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while yet retaining the same fundamental shape. They move from place to place, 
changing the postures of their bodies, ingesting and emitting certain substances, 
and doing all this spontaneously, initiating their own movements, which is to say 
that their movements are animate. These bodies are subject to the laws of 
mechanics and yet not subject to the laws of mechanics. They are so different 
from ordinary objects that infants learn almost immediately to distinguish them 
from plants and nonliving things. When touched they touch back, when struck 
they strike back, in short they interact with the observer and with one another. 
Behavior affords behavior, and the whole subject matter of psychology and of 
the social sciences can be thought of as an elaboration of this basic fact. Sexual 
behavior, nurturing behavior, fighting behavior, cooperative behavior, economic 
behavior, political behavior—all depend on the perceiving of what another person 
or other persons afford—or sometimes on the misperceiving of it. 

What the male affords the female is reciprocal to what the female affords the 
male; what the infant affords the mother is reciprocal to what the mother 
affords the infant; what the prey affords the predator goes along with what the 
predator affords the prey; what the buyer affords the seller cannot be separated 
from what the seller affords the buyer, and so on. The perceiving of these mutual 
affordances is enormously complex but it is nonetheless lawful, and it is based 
on the pickup of the information in touch, sound, odor, taste, and ambient light. 
It is just as much based on stimulus information as is the simpler perception of 
the support that is offered by the ground under one’s feet. For other animals 
and other persons can only give off information about themselves insofar as they 
are tangible, audible, odorous, tastable, or visible. 

The other person, the generalized other, the alter as opposed to ego, is an 
ecological object with a skin, even if clothed. It is an object although it is not 
merely an object, and we do right to speak of you and he instead of it. But he 
has a surface that reflects light and the information to specify what he is, what 
he invites, promises, or threatens, and what he does, can be found in the light. 


Summary: Positive and Negative Affordances 


The foregoing examples of the affordances of the environment are enough to 
show how general and powerful the concept is. Substances have biochemical 
offerings, and afford manufacture. Surfaces afford posture, locomotion, colli- 
sion, manipulation, and in general behavior. Special forms of layout afford 
shelter and concealment. Fires afford being warmed and being burned. Detached 
objects, tools, utensils, weapons, afford special types of behavior to primates and 
men. The other animal and the other person provide mutual and reciprocal 
affordances at extremely high levels of behavioral complexity. At the highest 
level, when vocalization becomes speech and manufactured displays become 
images, pictures, and writing, the affordances of human behavior are staggering. 
No more of that will be considered at this stage except to point out that speech, 
pictures, and writing still have to be perceived. 
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At all these levels, from matter to men, we can now observe that some offerings 
of the environment are beneficial and some are injurious. These are slippery 
terms which should only be used with great care, but if their meanings are 
pinned down to biological and behavioral facts the danger of confusion can be 
minimized. First, consider substances that afford ingestion. Some afford nutri- 
tion for a given animal, some afford poisoning, and some are neutral. As I 
pointed out before, these facts are quite distinct from the affording of pleasure 
and displeasure in eating, for the experiences do not necessarily correlate with 
the biological effects. Second, consider the brink of a cliff. On the one side it 
affords walking-along, locomotion, whereas on the other it affords falling-off, 
injury. Third, consider a detached object with a sharp edge, a knife. It affords 
cutting if manipulated in one manner but it affords being cut if manipulated in 
another manner. Similarly, but at a different level of complexity, an ordinary 
metallic object affords grasping but if charged with current it affords electric 
shock. And fourth, consider the other person. The animate object can give you 
caresses or blows, contact comfort or contact injury, reward or punishment, and 
it is not always easy to perceive which will be provided. Note that all these 
benefits and injuries, these safeties and dangers, these positive and negative 
affordances are properties of things taken with reference to an observer but not 
properties of the experiences of the observer exclusive of the things. They are 
not subjective values; they are not feelings of pleasure or pain added to neutral 
perceptions. 

There has been endless debate among philosophers and psychologists as to 
whether values were physical or phenomenal, in the world of matter or only in 
the world of mind. For affordances as distinguished from values the debate does 
not apply. They are neither in the one world or the other inasmuch as the theory 
of two worlds is rejected. There is only one environment, although it contains 
many observers with limitless opportunities for them to live in it. 


THE ORIGIN OF THE CONCEPT OF AFFORDANCES 


The Gestalt psychologists recognized that the meaning or value of a thing seems 
to be perceived just as immediately as its color. The value is clear on the face of 
it, as we say, and thus it has a physiognomic quality in the way that the 
emotions of a man appear on his face. To quote from the Principles of Gestalt 
Psychology (Koffka, 1935): “Each things says what it is...a fruit says ‘Eat 
me’; water says ‘Drink me’; thunder says ‘Rear me’; and woman says ‘Love me’ 
[p. 7].” These values are a vivid and essential feature of the experience itself. 
Koffka did not believe that a meaning of this sort could be explained as a pale 
context of memory images or an unconscious set of response tendencies. The 
postbox “invites” the mailing of a letter, the handle “wants to be grasped,” and 
things “tell us what to do with them [p. 353].” Hence they had what Koffka 


called “demand character.” 
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Kurt Lewin had coined the term Aufforderungscharakter which had been 
translated as invitation-character (by J. F. Brown in 1929) and as valence (by D. 
K. Adams in 1931). The latter term came into general use. Valences for Lewin 
had corresponding vectors, which could be represented as arrows pushing the 
observer toward or away from the object. What explanation could be given for 
these valences, the characters of objects that invited or demanded behavior? No 
one, not even the Gestalt theorists, could think of them as physical and, indeed, 
they do not fall within the province of ordinary physics. They must therefore be 
phenomenal, given the assumption of dualism. If there were two objects, and if 
the valence could not belong to the physical object it must belong to the 
phenomenal object—to what Koffka called the “behavioral” object but not to 
the “geographical” object. The valence of an object was bestowed upon it in 
experience, and bestowed by a need of the observer. Thus Koffka argued that 
the postbox has a demand character only where the observer needs to mail a 
letter. He is attracted to it when he has a letter to post, not otherwise. The value 
of something was assumed to change as the need of the observer changed. 

The concept of affordance is somewhat related to these concepts of valence, 
invitation, and demand but with a crucial difference. The affordance of some- 
thing does not change as the need of the observer changes. Whether or not the 
affordance is perceived or attended to will change as the need of the observer 
changes but, being invariant, it is always there to be perceived. An affordance is 
not bestowed upon an object by a need of an observer and by his act of 
perceiving it. The object offers what it does because it is what it is. To be sure, 
we define what it is in terms of ecological physics instead of physical physics, 
and it therefore possesses meaning and value to begin with. But this is meaning 
and value of a new sort. 

For Koffka it was the phenomenal postbox that invited letter mailing, not the 
physical postbox. But this duality is pernicious. I prefer to say that the real 
postbox (the only one) affords letter-mailing to a letter-writing human in a 
community with a postal system. This fact is perceived when the postbox is 
identified as such, and it is apprehended whether the postbox is in sight or out 
of sight. To feel a special attraction to it when one has a letter to mail is not 
surprising but the main fact is that it is perceived as part of the environment—as 
an item of the neighborhood in which we live. Everyone above the age of six 
knows what it is for and where the nearest one is. The perception of its 
affordance should therefore not be confused with the temporary special attrac- 
tion it may have. 

The Gestalt psychologists explained the directness and immediacy of the 
experience of valences by postulating that the ego is an object in experience and 
that a “tension” may arise between a phenomenal object and the phenomenal 
ego. When the object is in “a dynamic relation with the ego” said Koffka, it has 
a demand character. Note that the “tension,” the “relation,” or the “vector” 
must arise in the “field” that is, in the field of phenomenal experience. Although 
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many psychologists find this theory intelligible, I do not. There is an easier way 
of explaining why the values of things seems to be perceived immediately and 
directly. It is because the affordances of things for an observer are specified in 
stimulus information. They seem to be perceived directly because they are 
perceived directly. 

The accepted theories of perception, to which the Gestalt theorists were 
objecting, implied that no experiences were direct except sensations, and that 
sensations mediated all other kinds of experience. Bare sensations had to be 
clothed with meaning. The seeming directness of meaningful perception was 
therefore an embarrassment to the orthodox theories and the Gestaltists did 
right to emphasize it. They began to undermine the sensation-based theories. But 
their own explanations of why it is that a fruit says “Eat me” and woman says 
“Love me” is a bit strained. The Gestalt psychologists objected to the accepted 
theories of perception but they never managed to go beyond them. 


THE PROCESS OF PERCEIVING AFFORDANCES 


The definition of an affordance can now be elaborated by saying that it is a 
combination of physical properties of the environment that is uniquely suited to 
a given animal—to his nutritive system or his action system or his locomotor 
system. A substance is chemically valuable relative to a given nutritive system, 
herbivorous or carnivorous. An object is valuable relative to a given action 
system, one with claws or another with hands. A surface layout has locomotor 
value relative to the kind of legs and feet the animal possesses. 

If there is information in ambient light to specify substances, solid objects, and 
surface layouts there is information to specify their affordances for eating, for 
manipulation, and for locomotion, that is, for behavior. When an observer 
perceives edibility he perceives it in relation to his mouth and teeth and digestive 
system; when he perceives manipulability he perceives it in relation to his hands, 
to which the object or tool is suited; when he perceives the possibility of 
locomotion he perceives it in relation to what his locomotor system is capable of 
in walking or climbing, the slopes it can descend or the ditches it can jump over. 
This is only to reemphasize that perception of the environment is inseparable 
from proprioception of one’s own body—that egoreception and exteroception 
are reciprocal. A man can bite into an apple but not a rock; he can get a grip on 
a handle but not on a wall; he can jump over a gap commensurate with his size 
and strength but he will fall into a crevasse that is too wide to jump. He 
measures these features of the environment by the standard of his body. And 
this is just as true for a mouse as it is for a man. 

Many of the chemical, physical, and geometrical properties of the natural 
environment are specified in ambient light, as I tried to show in my discussion of 
ecological optics. The hypothesis I proposed in The Senses Considered as 
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Perceptual Systems (Gibson, 1966) is that the visual system of a mature observer 
can pick up this information or else can be altered by perceptual learning so that 
it is picked up. I now want to extend this proposal to cover the perception of 
affordances. These unique combinations of chemical, physical, and geometrical 
properties are also specified in ambient light. A compound invariant of optical 
structure is just another invariant. And a genuinely invariant compound can 
presumably be detected as a unit, without any need to associate the com- 
ponents. In classical terminology, several “stimuli” that always go together 
constitute one “stimulus.” If these unique optical compounds are meaningful in 
the sense that they specify benefits and dangers for the given observer they 
should be easier to detect, that is, picked up with less learning, than other 
combinations of optical information that are not ego-related. The properties of 
things as such are less important to an observer than the affordances for him. 

For example, the meaning of an arbitrary combination of properties invented 
by an experimenter in a laboratory should be harder to detect than the meaning 
of a natural invariant compound. An ape can learn that a one-inch flat blue 
triangle on the panel of a discrimination apparatus specifies a piece of banana 
behind the panel. But he should learn more easily that a 6-inch long rounded 
yellow surface specifies a banana behind its skin. The solid yellow object says 
“Eat me,” in Koffka’s words, more directly than does the flat blue form. The 
panel of the apparatus may come to say “Push me,” but only that. 

If this is true, some compound invariants specify their affordances directly and 
we say that the object or surface looks like what it is. Other compound 
invariants do not specify their affordances so directly and then we are apt to say 
that the object or surface does not look like what it is. The fact that a small 
piece of metal in a complex household gadget affords electric shock may be a 
hidden fact; to perceive it entails the apprehension of a set of concealed 
connections. Learning to apprehend electrical connections is rather difficult, and 
even the electrician sometimes makes mistakes. 


The Misperceiving of Affordances 


The brink of a cliff affords falling off; it is in fact dangerous and it looks 
dangerous to us, It seems to look dangerous to many other terrestrial animals 
besides ourselves, including infant animals. Experimental studies have been made 
of this fact. If a sturdy sheet of plate glass is extended out over the edge it no 
longer affords falling and in fact is not dangerous, but it may still look 
dangerous. The optical information to specify depth-downward-at-an-edge is still 
present in the ambient light; for this reason the device was called a “visual cliff” 
by Gibson and Walk (1960). Haptic information was available to specify an 
adequate surface of support but this was contradictory to the optical informa- 
tion. When human infants at the crawling stage of locomotion were tested with 
this apparatus many of them would pat the glass with their hands but would not 
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venture out on the surface. The babies misperceived the affordance of a trans- 
parent surface for support, and this result is not surprising. 

Similarly, a man can misperceive the affordance of a sheet of glass by 
mistaking a closed glass door for an open doorway and attempting to walk 
through it. He then crashes into the barrier and is injured. The affordance of 
collision was not specified by the outflow of optical texture in the array, or 
insufficiently specified. He mistook glass for air. The occluding edges of the 
doorway were specified and the empty visual solid angle opened up sym- 
metrically in the normal manner as he approached, so his behavior was properly 
controlled, but the imminence of collision was not noticed. A little dirt on the 
surface, or highlights, would have saved him. 

These two cases are instructive. In the first a true affordance of support went 
unexploited because a false negative affordance of falling opposed it. In the 
second a negative affordance of collision went unnoticed and a positive afford- 
ance of exiting (going out) was mistakenly registered. A failure to perceive what 
is present in the environment and a perceiving of something not present in the 
environment are both cases of misperception. Usually they go together. To see 
what is there implies not seeing what is not there. 

The very possibility of perceiving entails, of course, the possibility of misper- 
ceiving. The problem for the psychology of perception is to discover the 
conditions that govern both. For a theory of visual perception based on the 
pickup of available information, a theory of direct perception not mediated by 
subjective sensations, misperception can be explained in two general ways: either 
the available information is inadequate or, if not, the process of information 
pickup is deficient. On the one hand, visual perception fails in the dark because 
of the absence of stimulation, and it fails in a fog-filled medium because of the 
absence of structure in ambient light even with the presence of stimulation. 
Information is not available. It may also fail because the available optical 
information is equivocal or contradictory, or even sometimes because it is 
discrepant with the information given to touch, although this is rare. On the 
other hand invariants may fail to be picked up because the eyes are closed, or 
because the lens of the eye is opaque, or because the retina is diseased or 
dazzled, or because the optic nerve is severed. At the level of the whole visual 
system information may not be registered because the retina-nerve-brain-eye 
system is immature, or because the observer has not yet learned to extract the 
specifying invariants, or simply because the observer fails to look around him, or 
fails to look at the fine details. I have described the possible reasons for 
misperceiving in Chapter 14 of The Senses Considered as Perceptual Systems 
(Gibson, 1966). 

No wonder, then, that quicksand is sometimes mistaken for sand, that a pitfall 
can be mistaken for solid ground, that poison ivy is sometimes mistaken for ivy, 
and that acid can be taken for water. A wildcat is not easy to distinguish from a 
cat, and a thief may look like an honest man. When Koffka asserted that “each 
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thing says what it is” he neglected to mention that it may lie. The affordances of 
danger are sometimes hidden, like the electric shock in the radio cabinet and the 
shark under the calm water. 

Nevertheless, however true all this may be, the basic affordances of the 
terrestrial environment are perceivable, and are usually perceivable directly, 
without an excessive amount of learning. The reason is that the basic properties 
of the environment that combine to make an affordance are specified in the 
structure of ambient light and that hence the affordance itself is specified in 
ambient light. And, moreover, an invariant variable that is commensurate with 
the body of the observer himself is more easily picked up than one not 
commensurate with his body. 
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INTRODUCTION 


The 1960s saw a burst of new approaches to persistent problems in psychology. 
For me these approaches were heralded in the concepts of Image and Plan 
(Pribram, 1960). In a series of papers and books, often with the help of 
colleagues, I attempted to portray the power of these conceptions. At the 
neurological level, a two-process mechanism was detailed to show how Images 
and Plans were, in fact, generated (Pribram, 1971). At the behavioral level the 
concept Plan originated early on in observed similarities in the organization of 
serial actions and the organization of computer programs (Pribram, 1960). The 
concept Image took somewhat longer to ground in a model that allowed specific 
hypotheses to be generated. But by the mid-1960s it became clear that optical 
information processing systems could provide this model, especially in the 
construction of holographically produced Images (Pribram, 1972). These rather 
sketchy proposals have gradually been filled out with the accumulation of data 
from several laboratories, including my own. Languages of the Brain (Pribram, 
1971) spells out the relevance of these data to the theory and more recent 
additions are to be found in two papers: “The Holographic Hypothesis of 
Memory Structure in Brain Function and Perception” (Pribram, Nuwer, & 
Baron, 1974) and “How Is It that Perceiving So Much We Can Do So Little?” 
(Pribram, 1974a). 

Rather than detail once again the supports for the concepts of Image and Plan, 
I want here to address a set of specific issues that derive from the theories as 
they have been developed and to show that the computer theory of Plans and 
the holographic theory of Images are not mutually exclusive but stand in 
relation to each other much as other fundamental scientific theories (specifically 


theories in theoretical physics) do today. 
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My interest in these issues has several roots. The tap root concerns the 
two-process mechanism of brain function involving discrete nerve impulses and 
the pre- and postsynaptic depolarizations and hyperpolarizations that constitute 
a dynamic slow potential microstructure best described by continuous wave 
equations. This led me to inquire of my son, Professor John K. Pribram, a 
physicist, some details on the conceptual aspects of the parallel problem in 
quantum physics. The results of this inquiry are described below. 

A second major root stems from discussions with Professor Daniel Pollen on 
the nature of the Fourier or Fourier-like process in the visual system. We 
repeatedly puzzled on the nature of the reality imaged by the process. The 
cortical mechanism which we thought to be holographic stems from transforma- 
tions of a series of retinal images. These images themselves were constructed by 
the optics of the eye. Could it be that the retinal image was a special case 
(similar to a photographic image) in a series of reversible transformations that 
include other more holographic-like stages? Do we need a cortical imaging 
process at all? 

It finally occurred to me that the questions critical to these issues related to 
the nature of the external “reality” that was being transformed by the orga- 
nism’s perceptual mechanisms. In this respect, therefore, I had come to the same 
point of inquiry as James Gibson in the formulation of his elegant program of 
research. And it also led directly (on recommendation from J. K. Pribram) to 
interaction with theoretical physics in the person of David Bohm who in his own 
way came to Gibson’s position (Bohm, 1965, Appendix). Some of the early 
fruits of these inquiries and interactions are described below. 


THE HOLONOMIC THEORY 


Holograms provide a powerful mechanism for storing the image construction 
properties of optical information processing systems. What called attention to 
holograms is their distributed information state which makes them like the 
brain, highly resistant to damage. In addition, the holographic state allows a 
fantastic memory storage capacity: some hundred million bits of retrievable 
information have been stored in a cubic centimeter of holographic memory. This 
is accomplished by separately storing modulations of one or another spatial or 
temporal frequency. It is somewhat as if there were myriads of FM (frequency 
modulation) radios compressed into a tiny space. The short wave length of light 
(as compared to sound) makes such capabilities possible. In the brain, the short 
wave lengths characterizing the slow potential microstructure can be assumed to 
serve in a similar fashion. 

There are other properties (e.g., associative recall; translational, i.e., positional, 
and size invariance) of holograms that make the analogy with brain function in 
perception and memory attractive. These have been presented in another paper 
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(Pribram, Nuwer, & Baron, 1974). Here I want to emphasize that testable 
hypotheses can be formulated and models of actual brain function can be 
proposed within the domain of what can loosely be called the holographic 
properties of optical information processing systems. We have reviewed the 
evidence for image construction by the brain. What assemblies of neurons (and 
their processes), if any, function as true Fourier holograms? Which brain struc- 
tures function more like Fresnel holograms? Which mimic a Fourier process by 
convolving, integrating neighboring neural events and those at successive stages? 
These questions are being asked and experiments are being performed to provide 
answers. 

As might be expected, such experiments have already encountered one serious 
obstacle in drawing too close a parallel between optical information processes 
and image construction by the brain. This obstacle concerns the size of the 
receptive fields recorded for cells in the primary visual projection system. For 
example, the projection from the macular portion of the retina, the foveal 
receptive fields, is extremely small—some 3-5° of visual angle as a maximum. A 
hologram of this size will hardly account for the fact that information becomes 
distributed across the entire visual system as indicated by the evidence from 
resections and from electrophysiological recordings. 

A search has therefore been made for larger receptive fields that integrate the 
input from the smaller fields of the primary projection cortex. Such larger fields 
have been found in the cortex that surrounds the primary projection areas. It 
would be simple if one could assume that here, rather than in the primary 
projection cortex, the true holographic process takes place. 

But this simple assumption runs contrary to other evidence. First, it would not 
account, by itself, for the distribution of information within the projection 
cortex. Second, complete resection of this peri projection cortex (where the 
larger receptive fields are found) produces no permanent damage to image 
construction as far as one can tell from animal experiments (Pribram, Spinelli, & 
Reitz, 1969). 

Beyond these visual areas of the brain cortex, however, there is another, lying 
on the inferior surface of the temporal lobe which, when it is resected, leaves 
monkeys markedly and permanently impaired in their ability to make visual 
discriminations (Pribram, 1954, 1960, 1969). This impairment is limited to the 
visual mode (H. Pribram & Barry, 1956; M. Wilson, 1957). Only visual perfor- 
mances demanding a choice are impaired; other visual functions, such as tracking 
a signal, remain intact (Pribram, 1971, Chapter 17). The difficulty involves the 
ability to selectively attend to visual input (Gerbrandt, Spinelli, & Pribram, 
1970; Rothblat & Pribram, 1972; Gross, 1972). 

Much to everyone’s surprise, this visual “association” area (as the area with 
comparable function is known in man (Milner, 1958) appears to function 
remarkedly well when all known visual input to it is destroyed. As already 
noted, removal of the perivisual cortex has little permanent effect; destruction 
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of the thalamic input (from the pulvinar) to the inferior temporal cortex has no 
effect whatsoever (Mishkin, 1972; Ungerleiter, personal communication). Even 
combined lesions of perivisual and thalamic inputs do not permanently disrupt 
visual discriminations. 

These data make plausible the hypothesis that the inferior temporal cortex 
exerts its effect on vision via an output to the primary visual projection system 
(Pribram, 1958). Evidence in support of this hypothesis has accrued over the 
past 15 years: the configuration and size of visual receptive fields can be altered 
by electrical stimulation of the inferior temporal cortex (Spinelli & Pribram, 
1967); recovery cycles in the visual projection systems are shortened by such 
stimulation (Spinelli & Pribram, 1966); the pathways from the inferior temporal 
cortex have been traced (Whitlock & Nauta, 1956; Reitz & Pribram, 1969). 

Thus, another, more specific hypothesis can be entertained, namely, the 
suggestion that the inferior temporal cortex helps to program the functions of 
the primary visual projection systems. Specifically, such programming, as well as 
programming by input from sensory receptors, could “get together” the dis- 
tributed store of information from the. various loci of restricted receptive field 
size. If the relevant loci were addressed in unison they would, in fact, function 
like a hologram. 

The difference, therefore, between brain function and the function of optical 
information processing systems is the one set out at the beginning of this 
chapter. Brain is both an image construction and a programming device. Optical 
systems construct only images. 

The thesis presented here, therefore, suggests that the holographic-like store of 
distributed information in the primary visual projection system is akin to the 
distributed memory bank of a computer. The computer’s memory is organized 
more or less randomly; the brain’s memory has been stored along holographic 
principles. Both must be addressed by programs which access the appropriate 
“bits” of information. The computer does this serially; the brain, to a large 
extent, simultaneously, by pathways that allow signals to be transmitted in 
parallel. Such simultaneity in function produces momentary brain states that are 
akin to the holographic patterns that can be stored on film. 

Because of these differences between brain and optical systems, it may be 
better to talk about brain function as holonomic rather than just holographic or 
hologrammic. The term holonomic is used in engineering whever the systems, in 
an interactive set of such systems, are reasonably linear in their function. 
Linearity allows the computation of the functions of each system and therefore 
an estimate of the amount of their interaction—the “degrees of freedom” that 
characterize the interactive set. The interactions are known as the holonomic 
constraints on the system. In the context of the model of brain function in 
vision suggested here, the neural systems that determine any momentary visual 
state would have to be shown to be linear; then the amount of interaction 
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among the systems in producing the holographic visual state would appear as the 
degrees of freedom characterizing that state. 

Evidence is available to show that the visual system, despite local non- 
linearities, acts linearly overall above threshold (e.g., Ratliff, 1965). This is the 
case in other neural systems, notably the motor system (Granit, 1970). It is thus 
reasonable to propose that the holonomic model applies to brain functions other 
than visual. Support for such a proposal comes from work on the auditory (von 
Békésy, 1960), somatosensory (von Békésy, 1959) and even gustatory (von 
Békésy, 1967; Pfaffmann, 1960) and olfactory systems (Gesteland, Lettvin, Pitts, 
& Chung, 1968). 

Briefly summarizing, the holonomic model of brain function proposed that the 
brain partakes of both computer and optical information processes. The brain is 
like a computer in that information is processed in steps by an organized and 
organizing set of rules. It differs from current computers in that each step is 
more extended in space—brain has considerably more parallel processing capabil- 
ity than today’s computers. 

This parallel processing aspect of brain function leads to another difference. 
The rules of parallel processing are more akin to those that apply to optical 
information processes than they are to those used in current serial computers. 
Thus the momentary states set up by the programming activity are considerably 
like those of image constructing devices, that is, holographic. Thus memory 
storage is also holographic rather than random as in today’s computers. This 
does not deny, however, that storage of rules also takes place—as it does in 
machine peripherals (e.g., tapes for minicomputers). What the model requires is 
that the “deep structure” of the memory store is holographic. 

Since the holographic state is composed by programs and since the distributed 
store must be got together by the actions of and interactions among programs, 
the holographic brain state can be analyzed according to the systems that 
produces it. Thus the holonomic constraints or degrees of freedom that charac- 
terize the holographic state can be determined. The holonomic model of brain 
function is therefore mathematically precise, and its assumptions (such as overall 
linearity of component programming systems) and consequences (the distributed 
nature of the deep structure of the memory store) are, at least in principle, 
testable. 


IS PERCEPTION DIRECT OR CONSTRUCTIONAL? 


I want now to address some consequences to psychology (and perhaps to 
philosophy) of the holonomic theory of brain function. The theory, as we have 
seen, (1) stems from the metaphors of machine and optical information process- 
ing systems; (2) has developed by analogy to those systems, spelling out some 
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similarities and some differences; until (3) a testable holonomic model of brain 
function could be proposed. One way of understanding the model better is to 
compare it to another and to observe its relative explanatory power. 

An apparent alternative to the “holonomic” model is presented by James 
Gibson’s (1966) comprehensive “ecological” model of perception. Gibson’s 
model proposes that the “information” perceived is inherent in the physical 
universe and that the perceiver is sensitive to whatever information remains 
invariant across transformations produced by changes in the environment, by 
organism-environment displacements, and by the organism’s processing ap- 
paratus. The key concept in the ecological theory is “direct perception”—the 
environment as an ecological niche is directly apprehended by the perceiver. 

By contrast, the holonomic theory is constructional. Images are constructed 
when input from inferior temporal cortex (or its analog in other perceptual 
systems—see Pribram, 1974) activates, organizes the distributed holographic 
store. Images are produced and are therefore as much a product of the “informa- 
tion residing in” the organism, as they are of “information” contained in the 
environment. Philosophically speaking, the holonomic model is Kantian and 
Piagetian; the ecological model partakes of a critical realism. 

Clinical neurological experience wholly supports the holonomic view. Patients 
are seen who complain of macropsia and micropsia and other bizarre distortions 
of visual space. For instance, I once had a patient who, after a blow on the head, 
experienced episodes of vertigo during which the visual world went spinning. His 
major complaint was that every so often, when his perceptions again stabilized, 
they left him with the world upside down until the next vertigo which might 
right things once again. He had developed a sense of humor about these 
experiences, which were becoming less frequent and of shorter duration: his 
major annoyance he stated to be the fact that girls’ skirts stayed up despite the 
upside-down position! 

Further “clinical” evidence in support of the holonomic model comes from 
the experimental laboratory. Resections of the primate inferior temporal cortex 
markedly impair size constancy—the transformations across various distances 
over which environmental information must remain invariant in order to be 
“directly” perceived as of the same size. 

Yet Gibson (1966, 1968) and others who share his views (e.g., Johannson, 
1973; and more recently Hebb, in press), make a good case that in normal 
adult humans, perception is direct. A series of ingenious experiments has 
shown that by appropriate manipulations of “information,” illusions indis- 
tinguishable from the “real” can be created on a screen. The demonstrations are 
convincing and make it implausible to maintain a solopsistic or purely idealistic 
position with respect to the physical universe—that nothing but a buzzing 
blooming confusion characterizes external reality. With respect to the experi- 
ments he has devised, Gibson is correct. 
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Furthermore, if perception is direct, a dilemma for the holonomic theory 
would be resolved. When an optical hologram produces an image, a human 
observer is there to see it. When a neural hologram constructs an image, who is 
the observer? Where is the “little man” who views the “little man”? Direct 
perception needs no little men inside the head. Gibson (1966), in fact, deplores 
the term image because it calls up the indirectness of the representational 
process. However, if what we “directly perceive” is a constructed image and not 
the true organization of the external world—and we mistake this perception as 
veridical—perception would be both direct and constructional. 

The question to be answered therefore is by what mechanism can perception 
be both direct and constructional? A clue to the resolution of this dilemma 
comes from the Gibson (and Johansson) experiments themselves. Their displays 
produce the illusion of reality. When we know the entire experiment we can 
label the percept as an illusion, even though we directly experience it. In a 
similar fashion, the sound coming from the speakers of a stereophonic system is 
experienced directly. When we manipulate the dials of the system (changing the 
phase of the interacting, interfering sound waves) so that an equal part of the 
sound comes from each of the two speakers, we say that an illusion has been 
produced—the sound has been projected to the space between the speakers. 
Perception continues to be direct, but considerable computation is involved in 
determining the conditions over which the “information” contained in the 
sound remains invariant. We do not naively assume that the fireplace generates 
the sound. Despite the directness of the perception, it can be superficially 
misleading as to the actual characteristics of the physical universe. 

The issues appear to be these. Gibson abhors the concept “image.” As already 
noted, he emphasizes the “information” which the environment “affords” 
the organism. As an ecological theorist, however, Gibson recognizes the impor- 
tance of the organism in determining what is afforded. He details especially the 
role of movement and the temporal organization of the organism-environment 
relationship that results. Still, that organization does not consist of the construc- 
tion of percepts from their elements; rather the process is one of responding 
to the invariances in that relationship. Thus perceptual learning involves pro- 
gressive differentiation of such invariances, not the association of sensory 
elements. 

The problem for me has been that I agree with all of the positive contributions 
to conceptualization which Gibson has made, yet find myself in disagreement 
with his negative views (such as on “mages”) and his ultimate philosophical 
position. If indeed the organism plays such a major role in the theory of 
ecological perception, does not this entail a constructional position? Gibson’s 
answer is no, but perhaps this is due to the fact that he (in company with so 
many other psychologists) is basically uninterested in what goes on inside the 


organism. 
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What then does go on in the perceptual systems that is relevant to this 
argument? I believe that to answer this question we need to analyze what is 
ordinarily meant by “image.” Different disciplines have very different defini- 
tions of this term. 

The situation is similar to that which obtained in neurology for almost a 
century with regard to the representation we call “motor.” In that instance the 
issue was stated in terms of whether the representation in the motor cortex was 
punctile or whether in fact moyements were represented. A great number of 
experiments were done. Many of them using anatomical and discrete electrical 
stimulation techniques showed an exquisitely detailed anatomical mapping be- 
tween cortical points and muscles and even parts of muscles (Chang, Ruch, & 
Ward, 1947). The well-known homunculus issued from such studies on man 
(Penfield & Boldrey, 1937). 

But other, more physiologically oriented experiments provided different re- 
sults. In these it was shown that the same electrical stimulation at the same 
cortical locus would produce different movements depending on such other 
factors as position of the limb, the density of stimulation, the state of the 
organism (his respiratory rate, etc.). For the most part, one could conceptualize 
the results as showing that the cortical representation consisted of movements 
centered on one or another joint (e.g., Phillips, 1965). The controversy was thus 
engaged—proponents of punctate muscle representation vis-à-vis the proponents 
of the representation of movement. 

I decided to repeat some of the classical experiments in order to see for myself 
which view to espouse (reviewed in Pribram, 1971, Chapters 12 and 13). Among 
the experiments performed was one in which the motor cortex was removed 
(unilaterally and bilaterally) in monkeys who had been trained to open a rather 
complex latch box to obtain a peanut reward (Pribram, Kruger, Robinson, & 
Berman, 1955-1956). My results in this experiment were, as in all others, the 
replication of the findings of my predecessors. The latch box was opened, but 
with considerable clumsiness, thus prolonging the time taken some two- to 
threefold. 

But the interesting part of the study consisted in taking cinematographic 
pictures of the monkeys’ hands while performing the latch-box task and in their 
daily movements about the cage. Showing these films in slow motion we were 
able to establish to our satisfaction that no movement or even sequence of 
movements was specifically impaired by the motor cortex resections! The deficit 
appeared to be task specific, not muscle or movement specific. 

My conclusion was therefore that, depending on the Zevel of analysis, one 
could speak of the motor representation in the cortex in three ways. Anatomi- 
cally, the representation was punctate and of muscles, Physiologically, the 
representation consisted of mapping the muscle representation into movements, 
most likely around joints as anchor points. But behavioral analysis showed that 
these views of the representation were incomplete. No muscles were paralyzed, 
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no movements precluded by total resection of the representation. Action, 
defined as the environmental consequence of movements, was what suffered 
when motor cortex was removed. 

The realization that acts, not just movements or muscles, were represented in 
the motor systems of the brain accounted for the persistent puzzle of motor 
equivalences. We all know that we can, though perhaps clumsily, write with our 
left hands, our teeth, or, if necessary, our toes. These muscle systems may never 
have been exercised to perform such tasks, yet immediately and without practice 
can accomplish at least the rudiment required. In a similar fashion, birds will 
build nests from a variety of materials, and the resulting structure is always a 
habitable facsimile of a nest. 

The problem immediately arose of course as to the precise nature of a 
representation of an act. Obviously there is no “image” of an action to be 
found in the brain if by “image” one means specific words or the recognizable 
configuration of nests. Yet some sort of representation appears to be engaged 
that allows the generation of words and nests—an image of what is to be 
achieved, as it were. 

The precise composition of images-of-achievement remained a puzzle for many 
years. The resolution of the problem came from experiments by Bernstein 
(1967) who made cinematographic records of people hammering nails and 
performing similar more or less repetitive acts. The films were taken against 
black backgrounds with the subjects dressed in black leotards. Only joints were 
made visible by placing white dots over them. 

The resulting record was a continuous wave form. Bernstein performed a 
Fourier analysis on these wave forms and was invariably able to predict within a 
few centimeters the amplitude of the next in the series of movements. 

The suggestion from Bernstein’s analysis is that a Fourier analysis of the 
invariant components of motor patterns (and their change over time) is com- 
putable and that an image-of-achievement may consist of such computation. 
Electrophysiological data from unit recordings obtained from the motor cortex 
have provided preliminary evidence that, in fact, such computations are per- 
formed (Evarts, 1967, 1968). 

By “motor image” therefore we mean a punctate muscle-brain connectivity 
that is mapped into movements over joints in order to process environmental 
invariants generated by or resulting from those movements. This three-level 
definition of the motor representation can be helpful in resolving the problems 
that have become associated with the term “image” in perceptual systems. 

In vision, audition, and somesthesis (and perhaps to some extent in the 
chemical senses as well) there is a punctate connectivity between receptor 
surface and cortical representation. This anatomical relationship serves as an 
array over which sensory signals are relayed. At a physiological level of analysis, 
however, a mapping of the punctate elements of the array into functions occurs. 
This is accomplished in part by convergences and divergences of pathways but 
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even more powerfully by networks of lateral interconnectivities, most of which 
operate by way of slow graded dendritic potentials rather than by nerve impulses 
propagated in long axons. Thus in the retina, for instance, no nerve impulses can 
be recorded from receptors, bipolar or horizontal cells. It is only in the ganglion 
cell layer, the last stage of retinal processing, that nerve impulses are generated 
to be conducted in the optic nerve to the brain (reviewed by Pribram, 1971, 
Chapters 1, 6, and 8). These lateral networks of neurons operating by means of 
slow graded potentials thus map the punctate receptor—brain connectivities into 
functional ambiences. 

By analogy to the motor system, this characterization of the perceptual 
process is incomplete. Behavioral analysis discerns perceptual constancies just as 
this level had to account for motor equivalences. In short, invariances are 
processed over time and these invariances constitute the behaviorally derived 
aspects of the representation (e.g., Pribram, 1974b). Ordinarily, an organism’s 
representational processes are called images and there is no good reason not to 
use this term. But it must be clearly kept in mind that the perceptual image, just 
as the motor image, is more akin to a computation than to a photograph. 

We have already presented the evidence that for the visual system at least, this 
computation (just as in the motor system) is most readily accomplished in the 
Fourier or some similar domain. The evidence that pattern perception depends 
on the processing of spatial frequencies has been reviewed. It is, after all, this 
evidence more than any other that has suggested the holonomic hypothesis of 
perception. 

The perceptual image, so defined, is therefore a representation, a mechanism 
based on the precise anatomical punctate receptor-cortical connectivity that 
composes an array. This array is operated upon by lateral interconnections that 
provide the ambiences which process the invariances in the organism’s input. The 
cortical representation of the percepts go therefore beyond the anatomical 
representations of the receptor surfaces just as the cortical representation of 
actions goes beyond the mere anatomical representations of muscles. 

It is, of course, a well-known tenet of Gestalt psychology that the percept is 
not equivalent of the retinal (or other receptor) image. This tenet is based on the 
facts of constancy (e.g., size) and the observations of illusions. Neurophysiolo- 
gists, however, have only recently begun to seriously investigate this problem. 
Thus Horn (Horn, Stechler, & Hill, 1972) showed that certain cells in the 
brainstem (superior colliculus) maintained their firing pattern to an environmen- 
tal stimulus despite changes in body orientation; and in my laboratory Spinelli 
(1970) and also Bridgeman (1972) using somewhat different techniques demon- 
strated constancy in the firing pattern of cortical neurons over a range of body 
and environmental manipulations. Further, neurobehavioral studies have shown 
that size constancy is impaired when perivisual and inferior temporal cortex is 
removed (Humphrey & Weiskrantz, 1969; Ungerleider, 1975). 

The fact that the cortex becomes tuned to environmental invariances rather 
than just to the retinal image is borne out dramatically by a hitherto unex- 
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plained discrepancy in the results of two experiments. In both experiments a 
successful attempt was made to modify the orientation selectivity of the cortical 
neurons of cats by raising them from birth in environments restricted to either 
horizontal or vertical stripes. In one experiment (Blakemore, 1974) the kittens 
were raised in a large cylinder appropriately striped. A collar prevented the 
animals from seeing parts of their bodies—so they were exposed to only the 
stripes. However, and this turns out to be critical, the kittens could observe the 
stripes from a variety of head and eye positions. In contrast, in the other 
experiment, which was performed in my laboratory (Hirsch & Spinelli, 1970), 
head and eye turning was prevented from influencing the experiment by tightly 
fitting goggles onto which the stripes were painted. In both experiments cortical 
neurons were found to be predominantly tuned to the horizontal or vertical 
depending on the kittens’ environment, although the tuning in Blakemore’s 
experiments appeared to be somewhat more effective. The discrepancy arose 
when behavioral testing was instituted. Blakemore’s kittens were consistently 
and completely deficient in their ability to follow a bar moving perpendicular to 
the orientation of the horizontally or vertically striped environment in which 
they had been raised. In our experiment Hirsch, despite years of effort using a 
great number of quantitative tests, could never demonstrate any change in visual 
behavior! The tuning of the cortical cells to the environmental situation which 
remained invariant across transformations of head and eye turning was behav- 
iorally effective; the tuning of cortical cells to consistent retinal stimulation had 
no behavioral consequences. 

These results are consonant with others obtained in other sensory modes and 
also help to provide some understanding of how brain processing achieves our 
perception of an objective world separated from the receptor surfaces which 
interface the organism with his environment. 

Von Békésy (1967) has performed a large series of experiments on both 
auditory and somatosensory perceptions to clarify the conditions that produce 
projection and other perceptual effects. For example, he has shown that a series 
of vibrators placed on the forearm produce a point perception when the phases 
of the vibrations are appropriately adjusted. Once again, in our laboratory we 
found that the cortical response to the type of somatosensory stimulation used 
by von Békésy was consonant with the perception, not with the pattern of 
physical stimulation of the receptor surface (Dewson, 1964; Lynch, 1971). 
Further, von Bekesy showed that when such vibrators are applied to both 
forearms, and the subject wears them for awhile, the point perception suddenly 
leaps into the space between the arms. 

Other evidence for projection comes from the clinic. An amputated leg can 
still be perceived as a phantom for years after it has been severed and pickled in 
a pathologist’s jar. A more ordinary experience comes daily to artisans and 
surgeons who “feel” the environment at the ends of their tools and instruments. 

When validation is lacking or incomplete, we tend to call the perception an 
illusion and pursue a search for what physical events may be responsible for the 
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illusion. Gibson and his followers are correct, perception is direct. They are 
wrong if and when they think that this means that a constructional brain process 
is ruled out or that the percept invariably and directly gives evidence of the 
physical organization that gives rise to perception. 

As noted, there is altogether too much evidence in support of a brain 
constructional theory of perception. The holonomic model, because of its 
inclusion of parallel processing and wave interference characteristics readily 
handles the data of projection and illusion that make up the evidence for direct 
perception. The holonomic model also accounts for’ the “directness” of the 
perception; holographic images are not located at the holographic plane, but in 
front or beyond it, away from the constructional apparatus and more into the 
apparently “real,” consensually validatable external world. 


STRUCTURE AND PROBABILITY 


In the concluding part of this chapter, I want, therefore, to explore some 
questions as to the organization of this external “real” physical world. Unless we 
know something of consensually validatable “information” that remains in- 
variant across transformations of the input to the brain—and, as we have seen, 
we cannot rely only on the directness of our perceptual experience for this 
knowledge—how can we think clearly about what is being perceived? Questions 
as to the nature of the physical universe lie in the domain of the theoretical 
physicist. The science of physics has enjoyed unprecedented success not only in 
this century, but in the several preceding ones. Physicists ought to know 
something, therefore, about the universe we perceive; and, of course, they do. 
However, as we shall shortly see, the structure-distribution problem is as 
pervasive here as it is in brain function. 

The special theory of relativity made it clear that physical laws as conceived in 
classical mechanics hold only in certain circumscribed contexts. Perceptions of 
the Brownian “random” movements of small suspended particles, or of the paths 
of light coming from distances beyond the solar system, strained the classical 
conceptions to the point where additional concepts applying to a wider range of 
contexts had to be brought in. As in the case of direct perception, the laws of 
physics must take into account not only what is perceived, but the more 
extended domain in which the perception occurs. The apparent flatness of the 
earth we now know as an illusion. 

The limitations of classical physics were underscored by research into the 
microcosm of the atom. The very instruments of perception and even scientific 
observation itself became suspect as providing only limited, situation-related 
information. Discrepancies appeared, such as an electron being in two places 
(orbits) at once or at best moving from one place to another faster than the 
speed of light-the agreed-upon maximum velocity for any event. And within the 
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nucleus of the atom matters are worse—a nuclear particle appears to arrive in one 
location before it has left another. Most of these discrepancies result from the 
assumption that these particles occupy only a point in space—thus when the 
equations that relate location to mass or velocity are solved, they lead to 
infinities. Furthermore, in the atomic universe, happenings take place in jumps— 
they appear to be quantized, that is, particulate. Yet when a small particle, such 
as an electron, or a photon of light, passes through a grating and another particle 
passes through a neighboring grating, the two particles appear to interact as if 
they were waves, since interference patterns can be recorded on the far side of 
the gratings. It all depends on the situation in which measurements are made 
whether the “wavicle” shows its particle or its wave characteristics. 

Several approaches to this dilemma of situational specificity have been for- 
warded. The most popular, known as the Copenhagen solution, suggests that the 
wave equations (e.g., those of Schrödinger, 1935, and de Broglie, 1964) describe 
the average probabilities of chance occurrences of particulate events. An earlier 
solution by Niels Bohr (the “father” of the Copenhagen group, 1966) suggested 
that particle and wave were irreconcilable complimentary aspects of the whole. 
Heisenberg (1959) extended this suggestion by pointing out that the whole 
cannot in fact be known because our knowledge is always dependent on the 
experimental situation in which the observations are made. Von Neumann 
(1932) added, that given a positivistic operational framework, the whole reality 
becomes therefore not only unknown but unknowable. Thus the whole becomes 
indeterminable because we cannot in any specific situation be certain that what 
we are observing and measuring reflects “reality.” In this sense, as well as from 
the viewpoint of brain processes, we are always constructing physical reality. 
The arguments of the quantum physicist and those of the neurophysiologist and 
psychologist of perception are in this respect identical. 

But several theoretical physicists are not satisfied with these solutions or lack 
of solutions. Feynman, Leighton, and Sands (1965), for instances, note that 
though we have available most precise and quantitative mathematical descrip- 
tions in quantum mechanics, we lack good images of what is taking place. (His 
own famous diagrams show time flowing backwards in some segments!) De 
Broglie (1964), who first proposed wavelike characteristics for the electron fails 
to find solace in a probabalistic explanation of the experimental results that led 
him to make the proposal. And de Broglie is joined by Schrödinger (1935), who 
formulated the wave equation in question and especially by Einstein, whose 
insights led him to remain unconvinced that an unknowable universe, macro and 
micro, was built on the principle of the roulette wheel or the throw of dice. 

I share this discomfort with attributing t00 much to chance because of an 
experience of my own. In the Museum of Science and Industry in Chicago, there 
is a display which demonstrates the composition of a Gaussian probability 
distribution. Large lead balls are let fall from a tube into an open maze made of 
a lattice of shelves. The written and auditory explanations of the display 
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emphasize the indeterminate nature of the path of each of the falling balls and 
provide an excellent introduction to elementary statistics. However, nowhere is 
mention made of the symmetrical maze through which the balls must fall in 
order to achieve their probabilistic ending. Having just completed Plans and the 
Structure of Behavior (Miller, Galanter, & Pribram, 1960), I was struck by the 
omission. In fact, students of biology routinely use statistics to discover the 
orderliness in the processes they are studying. For example, when a measurable 
entity shows a Gaussian distribution in a population, we immediately look for its 
heritability. Perhaps the gas laws from which statistics emerged have misled us. A 
Gaussian distribution reflects symmetrical structure and not just the random 
banging about of particles. Again, the physical reality behind the direct percep- 
tion may contain surprises. 

Moreover, when we obtain a probabilistic curve, we often refer to a distribu- 
tion of events across a population of such events—that is, a Gaussian dis- 
tribution. Could it be that for the physical universe, just as in the case of brain 
function, structure and distribution mutually interact? After all, the brain is a 
part of the physical universe. For brain function, we found structure to be in the 
form of program and distribution in the form of holograms. Is the rest of the 
physical universe built along these lines as well? 


THE STRUCTURAL AND HOLONOMIC ASPECTS OF ORGANIZATION 


David Bohm (1957), initially working with Einstein, has among others, made 
some substantial contributions to theoretical physics compatible with this line of 
reasoning. Bohm points out, as noted above, that the oddities of quantum 
mechanics derive almost exclusively from the assumption that the particles in 
question occupy only a point in space. He assumed instead that the “wavicle” 
occupies a finite space which is structured by subquantal forces akin to electro- 
magnetic and gravitational interactions. These interacting forces display fuctua- 
tions—some are linear and account for the wave form characteristics of the space 
or field. Other interactions are nonlinear (similar to turbulence in fluid systems) 
and on occasion produce quantal events. In biology, Thom (1972) has developed 
a mathematics to deal with such occurrences in the morphogenetic field and this 
mathematics has been applied to perception by Bruter (1974). Thom calls the 
emergence of quasi-quantal structures from turbulant processes “catastrophes.” 
In physics, the quantal structures that result from such catastrophic processes 
may, therefore, be only partially stable. Thus, they can disappear and reappear 
nearby in a seemingly random fashion, which, on the average, however, are 
subject to the more regular oscillations of the forces. In biology, observations 
pertaining to the entrainment of oscillatory processes by clocks or temporary 
dominant foci parallel these concepts. Bohm goes on to point out where in the 
subquantal domain these events will become manifest: the interactions of 
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high-frequency and high-energy particles in nuclear reactions, in black bodies, 
etc. An article in a recent issue of Scientific American reviews the contemporary 
scene in these attempts at a Unified Field Theory in the subquantal domain 
(Weinberg, 1974). 

More recently, Bohm (1971, 1973) has reviewed the conceptual development 
of physics from Aristotelian through Gallilean and Newtonian times to modern 
developments in the quantum mechanics. He points out how much of our image 
of the physical universe results from the fact that, since Gallileo, the opening of 
new worlds of inquiry in physics has depended on the use of lenses. Lenses have 
shaped our images and lenses objectify. Thus we tend to assess external space in 
terms of objects, things and particulars. 

Bohm goes on to suggest that image formation is only one result of optical 
information processing and proposes that we seriously consider the hologram as 
providing an additional model for viewing the organization of physical processes. 
He and his group are now engaged in detailed application of this basic in- 
sight to see whether in fact a holographic approach can be helpful in solv- 
ing the problems of high-energy nuclear physics. Initial developments have 
shown promise. 

As noted above, the subquantal domain shows striking similarities to holo- 
graphic organization. Just as in the case of brain processes presented here, 
Bohm’s theoretical formulations retain classical and quantum processes as well as 
adding the holographic. The holographic state described by wave equations and 
the particle state described quantally, are part of a more encompassing whole. 
The parallel holds because the holographic models describe only the deeper 
levels of the theory which is thus holonomic, rather than holographic, as we 
found it to be for the special case of brain function (where the deeper level is 
constituted of pre- and postsynaptic and dendritic potentials and the quantal 
level, of the nerve impulses generated by these slow potentials). 

Bohm relates structural and holographic processes by specifying the differ- 
ences in their organization. He terms classical and particle organization explicate 
and holographic organization implicate. Elsewhere (Pribram, 1971), I have 
made a parallel distinction for perceptual processes: following Bertrand Russell 
(1959), I proposed that scientific änalysis as we practice it today, begets 
knowledge of the extrinsic properties (the rules, structures, etc.) of the physical 
world. My proposal departs from Russell’s, however, in suggesting that intrinsic 
properties (e.g., which he defines as the stoneness of stones) are also knowable— 
that in fact they are the ‘ground’ in which the extrinsic properties are embedded 
in order to become realized. Thus artists, artisans, and engineers spend most of 
their time realizing the extrinsic programs, laws and rules of the arts and sciences 
by grounding them in an appropriate medium. For example, a Brahms sym- 
phony can be realized by an orchestra, on sheet music, on a long-playing record, 
or on tape. Each of these realizations come about after long hours of develop- 
ment of the medium in which the realization occurs. Russell was almost correct 
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in his view that the intrinsic properties of the physical world are unknowable— 
they have apparently little to do with the more enduring extrinsic properties, 
show no resemblances among themselves, and demand considerable knowhow to 
replicate. 

The sum of these ideas leads to the proposal that the intrinsic properties of the 
physical universe, their implicate organization, the field, ground or medium in 
which explicit organizations, extrinsic properties, become realized, are multi- 
form. In the extreme, the intrinsic properties, the implicate organization, is 
holographic. As extrinsic properties become realized, they make the implicate 
organization become more explicit. 

The consequence for this view is a reevaluation of what we mean by probabilis- 
tic. Until now, the image, the model of statistics, has been indeterminacy. If the 
above line of reasoning is correct, an alternate view would hold that a random 
distribution is based on holographic principles and is therefore determined. The 
uncertainty of occurrence of events is only superficial and is the result of 
holographic “blurring” which reflects underlying symmetries (much as does the 
Gaussian distribution in our earlier example) and not just haphazard occur- 
rences. This relation between appearance and reality in the subquantal domain 
of nuclear physics and its dependence on underlying symmetries (spin) is 
detailed in the review article in Scientific American already referred to (Wein- 
berg, 1974). 

A preliminary answer to the question posed at the outset of this section—what 
is it that we perceive?—is therefore that we perceive a physical universe not 
much different in basic organization from that of the brain. This is comforting 
since the brain is part of the physical universe as well as the organ of perception. 
It is also comforting to find that the theoretical physicist working from his end 
and with his tools and data has come to the identical problem (which is, in 
Gibson’s terms, the nature of the information which remains invariant across 
situations) faced by the neurophysiologist and psychologist interested in percep- 
tion (Bohm, 1965, Appendix). Though surprising, the fact that at least one 
renowned theoretical physicist has made a proposal that addresses this common 
problem in terms similar to those set forth on the basis of an analysis of brain 
function is most encouraging. For science is of a piece, and full understanding 
cannot be restricted to the developments made possible by one discipline alone. 
This is especially true for perception—where perceiver meets the perceived and 
the perceived meets the perceiver. 
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PART I: THEORETICAL MOTIVATION 


The Problem of Object Perception 


Typically, objects are best recognized by their shape. For this reason philoso- 
phers have considered shape to be a primary quality of objects while color has 
been deemed a secondary quality. But what is the shape of an object by which it 
might be recognized? Traditionally, theorists have identified the shape of objects 
with their spatial forms given in terms of their metric Euclidean description. In 
order for these spatial measurements to be useful, it is further assumed that the 
object must be rigid; in other words, that its shape does not change when 
displaced. For if shape were not invariant under displacement, no object would 
look the same when moved to a new place or even when reoriented in its old 
place. 

Following in this tradition many psychologists have assumed that the percep- 
tual information by which objects are recognized is identical with the optical 
information specifying its rigidity. Perhaps this is why nearly all experiments 
in the area of visual perception have used rigid objects as their stimuli, 
thereby tacitly endorsing the view that object perception is tantamount to the 
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perception of the rigid shape of objects. Such theorists hold that the perceptual 
information by which objects can be identified is isomorphic to the exact 
physical measurements of the object. This view has been termed the “con- 
stancy” hypothesis (Hochberg, 1964) (not to be confused with the phenomenon 
of perceptual constancies). Unfortunately, this hypothesis is false. However, 
faith in the validity of the constancy hypothesis, as part of the heritage of 
naive realism, is so wide spread that it is worth reviewing some of the evidence 
against it. 

Specifically, the constancy hypothesis is inconsistent with several well-known 
phenomena. For instance, the so-called “geometric” illusions, such as the Mül- 
ler-Lyer, Ponzo, Poggendorf, Wundt, and Herring illusions all provide counter 
evidence to the claim that perceptual information is isomorphic with the exact 
physical description of objects. Physically straight lines may appear bent, parallel 
lines may appear nonparallel, objects of equal size may appear larger or smaller 
than one another, and well-aligned objects may appear misaligned. 

To put the matter a little differently, there is abundant evidence that the 
geometry of perceptual space is not in perfect agreement with the classical 
geometry of physical space. This, of course, does not imply that objects are not 
recognized primarily in terms of their shape, but rather that we need a more 
adequate theory of what is meant by the shape of an object. In our opinion, the 
uncritical acceptance of the metric Euclidean ideal for shape and the classical 
view of perceptual space has thwarted attempts by most theorists to explain 
object perception, or as it is sometimes called, “form” or “pattern” perception. 
It is our contention that a less restrictive view of the nature of perceptual space 
is needed; one that construes the concept of shape more broadly. It is our 
hypothesis that a more adequate concept of shape and, consequently, a better 
understanding of object perception might emerge from studies which use stimu- 
lus materials other than simple, static, rigid objects. The studies reported in Part 
II of this chapter satisfy these guidelines. 


Toward a More Adequate Conception of Shape 


The constancy hypothesis was designed to explain perception of the physical 
world. Much to the chagrin of the naive realist, however, the very existence of 
geometric illusions indicates the utter failure of the hypothesis to even explain 
the perception of rigid geometric figures. To make matters still worse for the 
hypothesis, the natural world, as normally perceived, is replete with many 
identifiable shapes that are not rigid. Obviously, the constancy hypothesis is 
even less adequate for explaining the perception of these shapes. Consider the 
fact that natural objects which grow do not retain their rigid geometric forms 
but may still be recognized as the same object at different stages of growth. For 
example, we are often quite able to recognize persons who we have not seen 
for a long while, in spite of the fact that they may have grown to maturity, 
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grown fatter or slimmer, Similarly, we are able to identify the species of plants 
and animals at different stages of growth. The incredible property of growth 
transformations, unlike simple displacements, is that they radically alter the 
Euclidean shapes of objects in a nonrigid fashion and yet do not destroy their 
structural identity. Thus, in spite of their somewhat elastic nature, growing faces 
somehow still preserve sufficient information in their nonrigid shapes to specify 
their species, race, sex, and individual identify. 

When an artist caricatures a face, he applies nonrigid transformations to it. Not 
only do such transformations not destroy the perceived identity of the face, 
they frequently serve to enhance it. Ryan and Schwartz (1956) demonstrated 
that caricatures of objects are often more easily recognized than either photo- 
graphs or veridical line drawings of the objects. Gardner and Wallach (1965) 
similarly were able to show that profile silhouettes of babies and adults could be 
transformed so as to be judged more “babyish” or adult-like than normative 
representations. They likened these idealizations to what ethologists have called 
“supernormal” stimuli. Ethologists have shown in a variety of species that highly 
caricatured representations of natural prey, predators, rivals, mates, etc. are 
sometimes more likely to release inherited fixed-action patterns of behavior than 
the real object. 

In all these cases, nonrigid transformations preserve OT even enhance the 
perceptual recognition of objects. This implies that the equivalence classes of 
psychologically effective stimuli cannot be captured in the traditional Euclidean 
theories of perceptual space. In the next section we sketch the theoretical and 
methodological considerations that have guided our research into the problem of 
object perception. 


The Nature of Perceptual Space 


provide a suitable theory of 
Before this question can be 
what properties such a theory 
deration will be to provide a 
fferent from the traditional 
try. Secondly, we will not 


If classical metric Euclidean geometry fails to 
perceptual space, then what geometry might? 
properly addressed it would be useful to consider 
of perceptual space must possess. Our first consi 
conceptualization of a space in general that is di 
conception taught us by classical Euclidean geome 
assume a priori that there necessarily exists only a single geometry for perceptual 
space, that is, we will allow for the possibility that the structure of perceptual 
space may at any given time conform to the geometry of the stimuli being 
perceived. i 

Although this is surely a radical hypothesis, it seems to us still more apni 
tive than the postulation of a unique structure to perceptual space regardless o! 
what is being perceived as traditionally has been done. By assuming that 
perceptual space may be flexible we make the determination of its geometry on 
any particular occasion an empirical task. Whether we can ultimately discover a 
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single unified theory of perceptual processing is both a theoretical as well as an 
empirical challenge that cannot be adequately dealt with by simplistic assump- 
tions regarding the possible nature of perceptual space. A few background 
remarks about revolutionary revisions in the concept of physical space developed 
in this century will provide the requisite context for understanding what we 
believe might be the nature of perceptual space. 

A geometry can be thought of as a theory of a space and, conversely, for each 
possible concept of space there must be a geometry. Newton’s space for classical 
physics was absolute in the sense that it was a locus in which rigid objects could 
be placed. Einstein rejected this notion of physical space in favor of one 
proposed a half century earlier by Riemann. Riemann’s concept of space was not 
absolute, not merely “a locus in which... ,” but, rather, he considered space 
relative in the following sense: Space itself has a structure. Its structure is 
induced from the “stuff” it contains. Change the material in the space and you 
change the structure of the space itself. For example, the amount of mass in a 
region of space determines the degree of curvature of space in that region. 
Strictly speaking, in such a space there is but one object, namely, the space 
structure as a whole. What are individual objects in space for Newton become 
local aspects of the structure of Riemann’s space—a field notion (Bell, 1949). 

We can now state the hypotheses regarding shape which will guide our research 
on the nature of object perception: (a) the perceived shape of an object is due to 
the structure of the space to which it belongs; and (b) the structure of 
perceptual space is dependent upon the structure of the objects perceived as well 
as upon the structure of the perceptual systems. 

As unusual as these hypotheses might seem, there is considerable evidence to 
support them. For instance, regarding the first hypothesis, it appears that 
geometric illusions can be explained by the fact that the objects (lines) in the 
space induce a structure on the whole space. Consider a pair of parallel lines 
tuled on white paper. They look parallel until a fan pattern of radial lines is 
drawn around them (Herring’s or Wundt’s illusion); two circles appear equal in 
size until two converging lines are drawn around them (Ponzo illusion); or two 
lines appear of equal length until oppositely directed pairs of fins are added to 
each (Miiller—Lyer illusion), Hence, changing the contents of a region of space 
changes its structure. In all such cases, the difference between the physical 
objects and the perceived objects argues that perceptual space is nonrigid and 
resilient, possessing an overall structure with field properties analogous to what 
Einstein argued for physical space. 

Support for the second hypothesis, that perceptual space is structured by the 
objects perceived, can be found in two lines of research. Moore and Parker 
(1972), for instance, have programmed a pattern processing model which “‘sees” 
geometric illusions apparently as people do. When given, say, the Müller-Lyer 
figures as input instead of computing the exact physical measurements showing 
the equal figures to be equal, it computes a discrepancy in length of the same 
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magnitude as shown in the judgment of people. In a Master’s thesis under the 
direction of the principal author, Johnson (1973) was able to show that this 
effect arises because the model samples the continuous line drawings in a 
discrete manner by means of a parallel processor not unlike the retinal mosaic of 
the vertebrate eye. The hypothesis that the discretization of visual inputs by the 
nervous system of organisms causes such errors in perceptual processing suggests 
that illusions should be seen by nonhumans as well. Indeed, Revesz (1934) 
found that chicks were susceptible to the Jastrow illusion. If this hypothesis is 
valid, then the existence of geometric illusions provides evidence that perceptual 
space can be structured by intrinsic properties of the central nervous system 
(e.g., its discrete functioning). 

Furthermore research by Piaget and Inhelder (1956) and others (Laurendeau & 
Pinard, 1970; Pufall & Shaw, 1973) on the ontogenesis of perceptual space in 
the child demonstrates that experience can indeed alter the structure of per- 
ceptual space. Piaget argues that the child progresses from a concrete “action” 
space to a representational one as the symbolic function of intelligence develops. 
Borrowing the notion of a hierarchy of geometries from Klein, Piaget maintains 
that from roughly 2 to 12 years of age the child’s concept of perceptual space 
progressively develops from topology to Euclidean geometry. Therefore, the 
view that seems best supported is that perceptual space for the child, no less 
than for the adult, is a family of spaces with no single geometry. Perhaps this is 
why no significant progress has been made as yet in relating the areas of object 
perception and space perception. Indeed, the proposed view of perceptual space, 
in following Riemann’s concept of space, collapses both areas into a single one, 
where object perception is but the study of local regions of a more general 
spatial structure. 

Thus, while there is good reason to agree with the Gestaltists’ claim that the 
classical view of perceptual space as an inert, absolute space is woefully inade- 
quate and must be rejected, we need not revert to their view of perceptual space 
as a field of mysterious forces in the cortex where isomorphic representation of 
physical objects act upon each other. On the other hand, James J. Gibson’s 
(1966) theory of information pickup offers the basis from which we might 
propose a better alternative. Let us assume with Gibson that perceptual experi- 
ences belong to an information space, rather than to a field of mysterious 
cerebral forces as the Gestaltists did. Consequently, it seems likely that through 
evolution the perceptual systems have become intrinsically structured so that 
animals and humans are preattuned at birth to pickup just those dimensions of 
physical stimulation that convey ecologically significant invariants of informa- 
tion. Furthermore, there is ample evidence that the senses are not only gene- 


tically preattuned but become more sensitively calibrated to pickup those 
at bear directly on the survival, success and 


exigencies of the environment th ; 
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In addition to the intrinsic source of structualization of perceptual space by 
evolution and experience, there are extrinsic sources, namely, those properties of 
the environment and its contents which structure the energy distributions 
picked up. Before attempting to characterize more precisely how perceptual 
space becomes structured, let us explain the conundrum that the geometrical 
approach avoids. 


Perceptual Theory as Ecological 
Rather Than Empirical or Phenomenological 


Traditionally, no perceptual theory has successfully explained the relation of 
things experienced to things merely described. The first has to do with the 
phenomenology of perception while the second has to do with physicalistic or 
behavioristic descriptions of the conditions underwhich perceptions arise. Since 
our main goal in science has been to provide descriptions of phenomena capable 
of public corroboration, we have elected to eschew the more private or subjec- 
tive aspects of experience in favor of a phenomenology-free language—at least so 
far as possible. The difference between an empirical science and a phenomeno- 
logical one is that the former describes the conditions under which an observa- 
tion or measurement might be successfully carried out ceteris paribus while the 
latter attempts to describe the experience of observing per se. Herein lies the 
fundamental difficulty of perceptual theory because such a theory, unlike 
physics, chemistry or biology, has as its raison d’étre the scientific explanation 
of the relationships between the percept directly experienced (and hence phe- 
nomenological) and the object described from observation (and hence empiri- 
cal). By our tacit attempts to pattern psychology after classical physics, we have 
saddled ourselves with a theoretical language which by design is inappropriate to 
our task. An empirically biased theoretical language which can describe only the 
antecedent conditions for perceiving but never the experience itself. Hence the 
meaningfulness of percepts from the first person perspective is lost. 

Psychology is the juncture between science and epistemology; for it is here 
that the issue concerning what is known by description makes contact with the 
issue concerning what is known by acquaintance. Whatever else the so-called 
“mind-body” problem might be, it is surely perpetuated by this bifurcation in 
language whereby talk about the phenomenological consequents of perception 
belong to a separate mode of discourse than talk about the empirical antecedents 
of perception. This is why, as psychologists, we are much better able to provide 
a scientific description of the conditions under which a percept is likely to be 
experienced than we are a description of the perceptual experience itself, witness 
the failure of introspective psychology. 

The modern forms of the causal theory of perception which have been 
adopted by consensus by both psychologists and philosophers fails to address 
the problem of perceptual experience at all, but merely postulates an identity 
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between states of the brain and concomitant perceptual experiences. The recalci- 
trant mystery of this correspondence is apt to be swept under the rug. The 
language of perceptual theory tends to become “squishy” at this juncture where 
experience and the causal support for experience are semantically amalgamated. 

Our aim, however, is not to rehash this old chestnut but to suggest that as 
psychologists we give it a decent burial. With respect to the attempt to resolve 
the mind-body dilemma by seizing it by either horn, you are damned if you do 
and damned if you don’t. Therefore, let us choose the third way of resolving it 
and pass between the horns if we can; let us be no more phenomenologists than 
we are physicalists or behaviorists. Rather we need a third mode of discourse 
for perceptual theory which subsumes both of the other perspectives under a 
common rubric, and treats the object of perception and the perceptual experi- 
ence inter se. 

The distance separating the object to be perceived from the percept of the 
object is not truly spatial and, therefore, can not be bridged by causal sequences 
which unite physical objects with their nonphysical counterparts, the perceptual 
experiences, through the action of energy propagations on neural signals. The 
theory of the causal chain is science of the impersonal pronoun; science done in 
the third person, for example, “Jt causes it to be transformed to its new state 
from which it gives rise to its new effect.” The impersonal pronoun it is used as 
subject of a verb that expresses a condition or action when no reference to an 
agent is to be made. No wonder that perceptual experiences, which by their very 
nature can not be impersonalized but must belong to some agent, are emascu- 
lated of all meaning when science is carried out in this mode of discourse. On the 
other hand, phenomenology is theory carried out in the first person singular, for 
example, “I experienced X when perceiving under such and such conditions.” 

To go from one mode of discourse to the other in order to provide a spurious 
causal analysis of perception is an egregious error, for it implies that the agentive 
mode of discourse in the first person singular is logically equivalent to the non- 
agentive mode of discourse in the impersonal third person. No amount of hand- 
waving can equate it with J. (For a fuller discussion of the need of agent-like 
constructs in psychology, see Shaw & McIntyre, 1975.) 

What psychology needs is a theoretical language that somehow incorporates 
both the objectivity provided by physicalistic science and the person orientation 
of phenomenology. Such a language, ideally, would be ecological in the sense 
that it would be publically confirmable and yet would retain concepts that are 
meaningful with reference to agents. Following the above argument, an ecologi- 
cally based theoretical language would be in the first person plural mode of 
discourse. For example, “If we agents, with the same attunements, pickup 
information made available by such and such a physical display, then we will 
typically have the same experience X.” Such a theoretical language is well suited 
to explaining social agreement in perceptual experiences—the data foundation 


upon which the field rests. 
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Consequently, we propose as a candidate a geometric language wherein the 
relationship between percepts and objects perceivable are denoted as primitive 
terms. Such dyads are sufficient to provide the variables over which all knowl- 
edge relations might be constructed. In such a language perceptual space is an 
ecological information space since by a percept we no longer mean just a 
subjective experience nor a chain of states in a biological “it.” Rather a percept 
refers to the state of an agent who is in the act of picking up information made 
available by events in the environment that specifies the potential value (afford- 
ances) those events have for the satisfaction of present or future actions. 

In short, the concept of the affordance value of perceptual information, as 
proposed by Gibson (see his chapter, this volume), is the tertium quid needed to 
solve the problem of perception, Consequently, the problem to be addressed by 
perceptual theory is three fold: First, what is the nature of the information 
abstracted from physical displays (e.g. objects, events) in the world. Secondly, 
how is the abstraction process carried out? And, finally, for whom is the 
information intended? Is it intended for a hungry person, a person who wishes 
to stand, a bored person, a person who intends to hit a bullseye with an arrow, 
etc. 

By construing the problem of perception (and action) in the first person plural 
we automatically include the reference criteron “we” to whom public conforma- 
tion of all theoretical claims must ultimately be addressed. Hence, the ecological 
approach has the same built-in social objectivity of an empirical approach. But 
more importantly, given our recent flirtation with behaviorism, the “we” of the 
ecological approach retains its phenomenological reference so that the relative 
value of perceptual information is measured against the needs and intentions of 
the perceiving agent. 

Under this view, the fact that information has an emotional appeal or an 
affective color is retained as an indispensable datum for experimental psy- 
chology. Moreover, so far as agents are comparably attuned and the perceptual 
situation analogous, then the meaning and value of the information for these 
agents can be objectively determined. 


The Geometric Approach to Perceptual Theory 


Currently, no single perceptual theory exists which encompasses both the 
intrinsic factors of perception arising from the design and attunement of the 
perceptual systems as well as the extrinsic factors contributed by the objects and 
events perceived. A significant reduction in the scope of the problem and 
considerable more precision might be achieved by a “geometrization” of the 
dual factors involved. Likewise, this approach may be used to raise the concept 
of perceptual information to the level of meaningful information which is both 
knowable directly by acquaintance and not just describable in abstract, valueless 
terms. What is required is a means of relating the experience of percepts to the 
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information liberated from objects and events in the world by an active, 
investigative, perceiving creature. 

In an attempt to cut the Gordian knot of the mind-body conundrum, we offer 
the following postulate for perceptual theory: 


The act of picking up perceptual information is the act of experiencing it; the 
information abstracted from the objects, object complexes and events of the 
world are intrinsically valuable to the perceiving agent since, by definition, such 
information specifies the nature of its source and the active relationships the 
animal or human perceiver might enter into with respect to that source. 


This is, of course, just a restatement of Gibson’s claim that the information 
made available by the world is information that specifies what that world affords 
a given observer for good or ill. This is what it means to base perceptual theory 
on an ecological physics, that is, a physics described in terms selective of the 
attributes of the terrestrial environment that are most relevant to the successful 
existence of a perceiving animal or human. (Economics, political science, soci- 
ology, and anthropology are already ecologically relevant disciplines in this 
sense.) 

By analogy to Newton’s classical problem in physics regarding the action-at-a- 
distance achieved by one object on another through gravitational attraction, we 
have in psychology the action at a distance of objects and events in the physical 
environment on the psychological states of a person or animal. Using mathemati- 
cal techniques developed by Minkowski, Einstein was able to rid physics of 
mysterious forces acting over empty distances that separate objects. He ac- 
complished this feat by replacing the empty distance between objects with the 
structure of space itself in the manner suggested by Riemann’s global approach 
to geometry. Thus, the mass of one body was now to be thought of as 
penetrating the mass of another body as ripples of a distant disturbance on the 
surface of a pond might interpenetrate and affect one another. Like the pond, 
space itself provides the support for the disturbing effects of bodies placed in 
close proximity as well as for the warping of orbits of planets awash with the 
motions of large masses as they run their distant courses. 

A geometrization of perceptual theory might prove similarly beneficial in our 
efforts to clarify the perceptual effect of environmental bodies on conscious, 
sentient beings. Without the precision of geometric characterization, like New- 
ton’s gravitational attraction, perceptual information remains a mysterious force 
by which changes are effected across the distances that separate perceiver from 
that perceived. True, the physical distance that separates the body of the 
knowing-agent from object-known is filled with a flux of energy (light, sound, 
odors, etc.), but in what sense is it “filled” with information? 

The space of perceptual information is no more the space of physics than it is 
the subjective space of phenomenological experience. Both such spaces are 
constructs abstracted from the more complex space of relationships that bind a 
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knowing agent to the objects known. This abstraction is all the physicists or the 
phenomenologist knows and what the psychologist must dutifully acknowledge. 
Where the two complementary disciplines arise from denial of another half of the 
dyadic epistemic relation, psychology must begin with a more balanced view. As 
psychologists, we do not abstract solely from either term—the physical space or 
phenomenological space—rather we abstract from the relations defined by the 
product of the two spaces. The space of perceptual information is this new 
product space—an amalgamated spatial continuum not unlike the Einstein- 
Minkowski amalgamation of the three dimensions of physical space with the 
independent dimension of time. 

The invariants of perceptual information thus become the invariants of the 
higher dimensional continuum and the primitive correspondence upon which 
knowledge rests. In this sense, the geometrization of perceptual theory achieves 
a mergence of psychology with epistemology. 

In Part II we attempt to illustrate this geometric approach to the problem of 
perceiving faces as a special case of perceiving objects which are not rigidly inert 
in shape but undergo dynamic change due to growth. Our tactic is to discover 
the relevant invariants of perceptual information that specify aging, a significant 
ecological dimension of change, by discovering a geometry for growth defined in 
terms of transformations on structures that are potentially perceivable. We then 
carry out experiments to show that the postulated geometry of growth space 
does indeed specify the perceived changes in objects placed in that space, that is, 
the objects appear to age. 

Thus, the geometry of growth space that we attempt to give is not just one that 
captures biological aspects of change but, more importantly for our purposes, it 
is one that captures the perceptual information that humans (and, perhaps, even 
animals) experience. The ecological validity of the tasks performed by our 
subjects in these experiments seems to us well indicated by the ease and 
naturalness with which they were performed. 


PART II: THE GEOMETRY OF GROWTH— 
A THEORY OF PERCEPTUAL INFORMATION FOR AGING FACES 


Introduction 


A geometry is a set (usually a group) of transformations which leave certain 
properties of the structure of a space invariant. In fact, under the Riemannian 
view of space, what we mean by a geometric object is just such a set of 
properties left invariant by the transformations. For instance, in metric-Eucli- 
dean geometry, the geometry taught us in secondary school, shape is equivalent 
to the property of rigidity belonging to all geometric objects. By a rigid object 
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(or shape) we mean one such that the distance between any pair of points on its 
surface remains unchanged when the object is transformed by displacement or 
reflection. A transformation that leaves a property (for example, rigidity) 
invariant is called a symmetry operation or transformational invariant of the 
space (for example, rotation). The full set (or group) of transformations which 
constitute symmetry operations for rigid objects (that is, leaves their shape 
unchanged) is exactly what we mean by Euclidean geometry. Rotation, transla- 
tion, and reflection are the transformational invariants of Euclidean space 
(sometimes called the Euclidean group). The properties which remain invariant 
under the geometric transformations are usually called the symmetries or struc- 
tural invariants of the space. For instance, rigidity is a symmetry or structural 
invariant of Euclidean space. Hence when you know the transformational 
invariants (symmetry operations) and the structural invariants (symmetries) of a 
given space, then you know all there is to know about that space, that is to say, 
you have a geometry or theory of the space. 

The important point to be underlined from the above discussion is that the 
concept of invariant has two meanings: On the one hand, it can mean a 
symmetry operation (transformational invariant) while on the other it can mean 
a symmetry of the space (structural invariant). To anticipate what is to come; it 
is our intention to show how Gibson’s intuitive concept of invariant perceptual 
information can be rigorously characterized in terms of these two symmetry 
concepts. By doing so we will achieve nothing less than a theory of perceptual 
information. Or to put the matter differently, such a theory will qualify as a 
geometry of perceptual space. Consequently, to illustrate this approach we will 
concentrate on the development of a theory for the perception of objects. 


Toward a Revised Concept of Shape 


As pointed out above, the concept of shape corresponds to just those properties 
left invariant under the transformations associated with the geometry. As we 
relax these properties by introducing symmetry-breaking operations into the 
geometry, we weaken the concept of geometric shape (structural invariants) by 
reducing the number of symmetries of the space. In this way a hierarchy of 
geometries (or spaces) can be defined which range from the most restrictive, 
metric Euclidean geometry, to the least restrictive, topology. Moreover, the 
concept of shape by which objects in each of these geometries may be distin- 
guished or recognized as belonging to the same equivalence class, is relative to 
those properties of the space left invariant under the transformations defining 
the given geometry. Let us briefly survey what becomes of the concept of shape 
as we descend from the highest to lower geometries on this hierarchy. 

For metric Euclidean geometry, the shape of an object is identical to the 
property of rigidity. You will recall from the earlier discussion, that this 
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symmetry of Euclidean space simply means that the distance between any pair 
of points, x, — x2 = d, where x; refers to the abscissa of the point, remains 
invariant under the symmetry operations on the space (e.g. rotation, translation, 
or reflection). Surely, this concept of shape is very useful and a salient recourse 
for identifying objects. But it is by no means adequate for recognizing the 
structural equivalence of objects when they have been nonrigidly transformed, 
say by a change in their size (similar objects). Consider, for instance, the fact 
that a toy car is still recognized as a car in spite of its diminished size, just as 
giant’s cap would still be recognized as belonging to the class of objects found in 
your local haberdashery. Thus, size invariance does not seem to be a necessary 
condition for our recognition of objects. Technically speaking, a rigid object can 
not change size since this requires a stretching or compressing simultaneously in 
all directions. Such a size transformation that leaves the shape of the trans- 
formed object similar to that of its original form is called a radial transforma- 
tion, homethetic transformation or, simply, a transformation of similitude. By 
augmenting the Euclidean group of transformations with this size transformation 
a new geometry is created by violating or breaking a symmetry of the space. 
This, in turn, defines a space in which the congruence of rigid objects is but a 
special case of objects whose shapes are similar. Objects in such a space are said 
to have similar shapes if a ratio of similitude exists which maps one object into 
another, that is, if x; —x = k(x, —x2) holds for every point on each object, 
where x; refers to the abscissa value of the point corresponding to x; on the 
original object that has been transformed by a positive integral value k. 

Hence, the ratio of similitude is the structural invariant corresponding to the 
concept of the shape of an object in similarity space. Let us proceed to relax the 
symmetry properties of the space even further by introducing still more sym- 
metry-breaking operations into the original set of Euclidean transformations. 

A table remains the same table if a leaf is inserted so as to extend its length. A 
man on stilts is still recognizable as the same person in spite of his elongated 
height. A cube of clay is still the same mass if compressed or stretched into a 
rectangular shape. All of these transformations in the shape of an object, 
whether accompanied by size transformations or not, do not destroy completely 
the aspects of shape by which they may be recognized as a transformed object 
rather than a new object. 

The geometry which results from a strain operation that linearly stretches or 
compresses an object in but one direction at a time is called affine geometry. 
Examples of other objects with affine-equivalent shapes are squares and rec- 
tangles, on the one hand, and circles and ellipses on the other. A second affine 
transformation is a shear which takes the angles of an object into new angles. 
Shearing the angles of a square transforms it into a rhombus or shearing the 
angles of a rectangle transform it into a parallelogram. Such affine-equivalent 
shapes are not similar since the distances among points in all directions do not 


5. PERCEIVING THE FACEOF CHANGE 115 


remain proportional. Rather a shape invariant is created; one that must satisfy a 
ratio of division, that is, where the relationship 

X X _ X17 Xa 

X2 X3 x2 —X5 


holds among three collinear points on the original object and three collinear 
points on its transform. The flattening out of shapes in photographs made with a 
high-power telescopic lens is an example of a affine transformation. Still we find 
buildings and people photographed under such conditions readily recognizable. 
Consequently, information specifying shape must be still more abstract than that 
proposed by either simple Euclidean or similarity geometry. But can some 
semblence of the intuitive concept of the shape of an object be retained in still 
weaker geometries? Let us proceed to consider the even less restrictive notion of 
projectively equivalent shapes. 

In projective geometry little is preserved to count as “shape” except those 
properties, such as linearity, which are not lost under central or parallel projec- 
tion. Consider, for instance, the shapes of various shadows of objects: a 
trapezoidal shadow can be cast from a square object that is tilted with respect to 
the wall or screen used as the projective plane; similarly, a trapezoidal shape can 
be tilted so as to cast a square or rectangular shadow. 

Thus, square, rectangular, and trapezoidal shapes are said to be projectively 
equivalent. Whereas affine transformations never destroy the parallelism of sides 
of an object, projective transformations may. The shape invariant preserved by 
projective transformation is called an anharmonic ratio or, simply, a cross-ratio. 
The cross-ratio relationship that must be satisfied by projectively equivalent 
shapes can be expressed as follows: 


(x —X4) @& —¥2) _ Ge X) 3 2) 

(x1 —x3) @3 —X4) (x1 x3) (x5 —x4) 
where the xis refer to four distinct collinear points and the primed symbols 
refer to points lying on the projectively transformed object. 

Before venturing further down the hierarchy of geometries to discover the 
weakest conception of shape information possible, let us pause to recognize an 
interesting trend in the nature of the shape invariant for each successively 
weaker geometry. The geometric invariants by which equivalent shapes are 
defined becomes increasingly more complex and abstract as we descend from 
higher to lower geometries. By casting the shape invariant for each of the 
geometries discussed above in the same form this trend becomes quite apparent: 


(1) Euclidean space: 


X,—X2, a distance invariant. 
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(2) Similarty space: 
k(x, —x2), a ratio of similitude. 


(3) Affine space: 


X1 —X2 a ene 
, aratio of division. 
X2 —X3 


(4) Projective space: 


(1 —x4) &3 — x2) 
(x1 —x3) %3 —x4) i 


Thus, we can see that the concept of equivalent shapes becomes more complex 
because the equation specifying the shape invariant involves an increasing 
number of terms to be evaluated as well as operations to be performed. Even 
more interesting for perception, however, is the fact that in each subsequently 
weaker space the information specifying the shape invariant becomes consider- 
ably more abstract in the technical sense that fewer and more subtle properties 
of the space remain invariant. This is obvious from the fact discussed earlier, 
namely, that to construct weaker geometries you augment the Euclidean group 
of transformations with an ever increasing number of symmetry-breaking or 
invariant-destroying operations. It must be the case, then, that the most liberal 
concept of shape should reside in the weakest geometry of all which lies at the 
lowest point of the hierarchy—in a topology. 

We will not attempt to survey all the geometries that can be defined by the 
proposed relaxation method of introducing new symmetry-breaking operations 
into Euclidean space. Our aim is not to be exhaustive, but to demonstrate that 
the natural ambiguities that inhere in our intuitive concept of shape can be 
expressed rigorously in terms of geometric symmetry theory. 

In the next section we explore the topological conception of shape that seems 
to capture best what is meant by the shape invariant preserved under the 
viscal-elastic change produced by the biological process of growth, 


a cross ratio. 


Toward a Geometry of Growth Space 


The final question that must be answered is whether or not a perceptually 
significant shape invariant exists in the weakest of all geometries—a topology. A 
topology consists of a set of transformations that map the set of points S of one 
object into a set of points S’ such that it is 1 — 1, continuous, and has a 
continuous inverse. The shape invariant for a topology is specified by a homeo- 
morphism. Homeomorphisms preserve connectivity, endlessness of a curve, the 
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property of being a closed curve, incidence relations, linear order, and cyclic 
order, 

The relevant question for perceptual theory, is whether there exist ecologically 
significant objects for which changes occur that are homeomorphic, that is, are 
specified by a topological shape invariant. Unlike projective transformations 
which preserve linearity, topological transformations take curved lines into 
straight lines and vice versa. Thus, if we can find an example of homeomor- 
phically equivalent forms that are perceptually equivalent shapes, then we will 
have evidence that a topological shape invariant is ecologically valid. The human 
face seems to be such a case. Consider the following description of the remodel- 
ing transformation imposed upon the face as it grows: 

The human face, like any living object, is radically changed by growth. Enlow 
(1968) describes growth as a “remodeling” process which not only changes the 
size of the object (a transformation of similitude) but its shape as well (a 
nonrigid transformation). For instance, at first the neonatal human head exhibits 
an exaggerated cranium and a diminutive face. Over time, the facial complex 
grows more rapidly than the skull so that the overall dimensions of the head and 
face become disproportionately altered. The most important fact of growth is 
that craniofacial growth, no less than that of other parts of the body, does not 
take place merely by enlarging or even by depostion of interpolated tissue layers. 
Hard bone, unlike soft tissue, is not capable of interstitial growth since these 
involve expansive changes of existing tissue. Consequently, growth follows a 
“remodeling” process by which new bone tissue successively forces relocation of 
all other parts of the craniofacial complex. Growth, then, can be defined as a 
global remodeling process which induces a “viscal-elastic” change over the 
craniofacial complex resulting in a change in both its shape and size. 

If growth is indeed a remodeling transformation, then it not only changes the 
size of the face over time but, more importantly, modifies the shape of the face. 
In spite of such gross changes, older faces are still readily recognized as being 
transforms of younger ones seen earlier rather than being new faces never before 
seen. Indeed, most adults have had the experience of recognizing people that 
they have not seen for a decade or so, in spite of the gross changes that may have 
occurred to their faces due to growth or weight change. It is also quite common 
to recognize the common species of plants or animals in spite of the fact that 
they may be at different stages of growth. And, of course, we find it quite easy 
to recognize the sex and race of people never seen before. 

Although it is quite popular in current perceptual theory to claim that the 
shape equivalence of such objects may be due to the pickup of common features 
that are easily isolatable from the space in which they occur (i.e., that they are 
context-free features), we will try to show that this is an inadequate theory of 
object perception. The recognition of faces, animal or plants that have aged, or 
the identification of the species, race, or sex of people, animals, or plants 
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requires the pickup of a shape invariant that is a symmetry of a topological 
space. 

After several years of both theoretical and empirical work we have finally 
discovered what may prove to be the most appropriate topological transforma- 
tion to model growth. We call this topological transformation a cardioidal strain. 
The best way to understand what it does and why it is a transformational 
invariant of topological space rather than some more symmetrical space is to 
review the way in which it was discovered. (Later we will provide empirical 
evidence that it is also an informational invariant that is perceptually real.) 


The structural invariant for the growth of dicotyledenous structures. A 
common observation is that most plants follow a growth policy that is sur- 
prisingly symmetrical. Monocotyledenous or bladed plants grow symmetrically 
from opposing sides of a stem such as do grass and weeds. However, a more 
interesting growth pattern is that followed by dicotyledenous vegetables or 
flowers, for example, the kidney bean, pumpkins, apples, turnips, etc. These 
plants tend to grow symmetrically around a nodal point at the end of a stem 
rather than from the sides of the stem itself. 

What makes the growth of dicotyledons interesting is the fact that the human 
head grows in a similar fashion. From the embryonic stage to adulthood the 
profile of the human head, (ignoring facial detail and the mandible), is decidedly 
dicotyledenous in shape, possessing an approximate bilateral symmetry similar 
to that of an inverted valentine heart. Geometrically speaking, the technical term 
for an inverted heart-shaped figure (with a rounded top) is cardioid. 


An experimental test. Using three skull profiles, a 1-year old, a 10-year old 
and a 25-year old, we were able to show that a strain transformation (to be 
defined later) would map one skull continuously into another. That is to say, the 
strain transformation used was at least homeomorphic. We were able to achieve a 
nearly perfect fit of a cardioid when transformed by strain and shear to each 
skull profile for approximately two-thirds of its perimeter, omitting the facial 
mask (Shaw, McIntyre, & Mace, 1974). 

The importance of this finding for understanding the nature of growth proved 
serendipitously inestimable, for from this we were able to infer that the structual 
invariant of growth space might be a cardioidal shape invariant. Just as distance 
is the shape invariant for Euclidean space, the ratio of similitude is for similarity 
space, etc., so we suspected, but did not know at first how to prove, that the 
shape invariant for growth space might be cardioidal. Many years passed from 
this speculation (about 1971) until we were to show that the transformational 
invariant (the homeomorphic strain operation) was indeed a symmetry operation 
of a very special topological space. (Of course, at the time, we did not yet know 
to apply the geometrization approach suggested by Riemann to our problem.) 

Although the next section is rather technical, and may be skipped without 
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losing the thread of our argument, it is important to prove that the cardioidal 
strain transformation, which we believe to provide an accurate geometry for 
growth space, is indeed topological rather than Euclidean, similitudinal, affine, 
or projective. For if we know it is a topological shape invariant, then we have a 
good idea of how abstract and formally complex perceptual information for the 
age level of faces might be. And more generally, we obtain some idea of how 
abstract and thus featureless, in the concrete sense, that perceptual information 
for shape might be. 


Growth as cardioidal strain. D’Arcy Thompson (1917), the great British 
naturalist, suggested three candidates for the nonrigid component of the growth 
transformation: strain, shear, or radial (similarity) transformations. As we will 
try to show later, only the strain transformation is a likely candidate for the 
shape invariant perceived when judging the relative age level of human faces. 

Strain derives from an analysis of the interaction of forces on various types of 
tissues during growth. The strain imposed upon bony tissue by the stresses 
caused by the growth of the highly elastic soft integuments is as Thompson 
argued, “a direct stimulus to growth itself.” Therefore, the growth of the skull 
strains its shape along lines of least resistance dictated by the stress forces 
imposed by the growth of the brain, change in fluid pressures from within the 
cranial cavity, as well as the viscal constraints imposed by epithelia, muscle, 
cartilage, gravitational attraction, and atmospheric pressure from without. For 
all these reasons, strain seems appropriate as a geometry of the growth process 
which remodels the head in accordance with the stress imposed by the biological 
and physical forces mentioned above. Furthermore, by treating these forces 
geometrically, in the manner of Riemann’s global approach, their resultant effect 
can be expressed as a symmetry of growth space. For example, the middle row 
of Fig. 2 (page 121) (where shear level equals zero), illustrates how an arbitrarily 
selected face (the one at zero shear and zero strain) can be homeomorphically 
transformed into other facial profiles at different relative age levels. By inspec- 
tion, it is easy to see that the seven facial transforms depicted in that row are not 
Euclidean equivalent or even similar shapes. To prove, however, that they may 
not be either affine or projectively equivalent forms is not so obvious from 
inspection alone. Consequently, we offer the two following proofs to demon- 


strate this fact. 


Proof 1. A geometrical transformation is merely a function that relates 
geometric objects. Such a relation, y = f (x) or T(x) > y, means there exists two 
sets of objects (points, lines, circles, facial profiles, etc.) such that to each object 
x in one set there corresponds a definite object y in the other. The geometric 
objects that play the role of independent variables are called originals and those 
that play the role of dependent variables are called images for example, T (face; ) 
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> face,, where face, might be a younger vision of face; hence, face, is the 
original and face, the image. Let us also assume that the transformation 7 stands 
for the growth process by which face, is remodeled into face, . (See Fig. 1.) 

We refer to T as a cardioidal strain because when applied to a circle expressed 
in rectangular coordinates it maps the circle through a family of limacons which 
ultimately terminate in a cardioid, (that is, an epicycloid of one loop). Figure 1, 
(row where shear equals zero), indicates part of this series. Beginning with the 
spatial grid at zero strain and zero shear and moving in either direction along the 
row, notice how the cardioidal strain deforms this standard rectangular grid 
continuously into the grids with nonlinear lines. Consequently, a circle expressed 
in the standard grid, under this type of strain transformation, would become 
limacons of various degrees of deformation. Moreover, if the series were ex- 
tended sufficiently far in either direction, then the hypothetical circle would 
ultimately be transformed into a cardioid. 

It is especially important to note that the transformation is applied to the 
whole structure of the two-dimensional space (that is, the standard grid in Fig. 
1), following the global approach suggested by Riemann, rather than being 


Strain 


Shear 


+5 


+15 


FIG.1 Transformations of a two-dimensional Euclidean space by an affine shear trans- 
formation and a topological (cardioidal) strain transformation (standard grid is at shear = 0, 
strain = 0). 


5. PERCEIVING THE FACE OF CHANGE 121 


applied just to the circle itself as an object in that space. The circle, or any other 
object for that matter, placed in the space to which the cardioidal strain is 
applied, will be deformed to the same degree that the whole space is. Con- 
sequently, the facial profiles in Fig. 2 (which were used as stimuli in the 
experiments to be reported) were produced in just this manner; namely, the 
prototypical face (located at the zero-strain and zero-shear position in the array 
depicted in Fig. 2) was transformed (by a computer) into all the other facial 
profiles by applying the two global transformations, strain and shear, to the 
standard space. (More will be said about the role of shear in a later section.) 

Since cardioidal strain is our postulated geometry for growth, it is important 
to see how this transformation works. A simple, one-dimensional affine strain 
has the form of a linear transformation, 


x= poke or x! Shep yay 


where x’, y’ are the new coordinates of the transformed point (x, y) (i.e., any 
point in the standard Cartesian space) and k is a positive integer. However, if we 
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FIG.2 Transformations of a facial profile by shear and strain (untransformed profile is at 
shear = 0, strain = 0). 
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allow the coefficient of strain to be some value other than a positive integer, say 
a trignometric function, then the strain transformation defined is nonaffine. 
Moreover, if the strain coefficient defines a nonlinear change that deforms arcs 
of curves into lines and vice versa, then the resulting transformation is also 
nonprojective. Consequently, we show in what follows that the cardioidal strain 
is neither affine nor projective, but rather must belong to some special form of 
topology. 

The cardioidal strain transformation used in our studies can be stated most 
elegantly in polar coordinates as 


a) sr alata 


By restating Eq. (1) in rectangular coordinates, we see that this is not an affine 
strain, since a given affine strain must act linearly in but one direction, that is it 
is one-dimensional. The translation from polar to rectangular coordinates can be 
made by the equations 


x=rcosd, 
y=rsin®, 
then new coordinates achieved by transforming x and y would be 
x'=r' cos, 
Gas 
y =r sind. 
Now substituting the value of r’ from Eq. (1) yields 


x' =r(1—k sin 0) cos 0’, 
or 
x' =(1—ksin 0)r cos 6’; 
but since 0 = @ and x =r cos 0, then the above equation finally yields 
x =(1—ksin @)x. 
By a similar argument we solve for y’ and get y’ = (1 — k sin @)y. Now let (1— k 
sin 0) =k’, where k is the coefficient of cardioidal strain. We now have 
x! =k'x 
y.=ky, 
which means that the cardioidal strain is neither affine (linear) nor projective. 
The transformation is neither simple nor affine: it is not simple (one-dimen- 
sional) because the coefficient of strain k’ applies to both x and y coordinate 


dimensions; it is not affine since k' is a trignometric function rather being a 
positive integer. 


5. PERCEIVING THE FACEOF CHANGE 123 


Proof 2. To prove that the cardioidal strain is not projective, we need only 
show that it does not preserve collinearity of points but rather takes straight 
lines into curved ones. Although inspection of Fig. 1 is sufficient to show that 
the cardioidal strain does just that, it would be useful to state this fact formally: 
If we take the equation for a cardioid in polar form, that is, r = a(1 — sin 0), and 
let x or y equal a and add our coefficient of strain k, then this yields the 
equation of a family of limacons, that is, r = a(l — k sin 0). When k = 1 the 
equation yields a cardioid, when 1 > k > 1 yields a family of limacons, and 
when k = 0 the equation yields the special case of the circle. This means that the 
cardioidal strain is a strain transformation belonging to a topology rather than to 
affine or projective space. We can now give our final argument regarding the 
nature of the shape invariant preserved by growth. 

Any figure which can be homeomorphically transformed into a circle is called 
a simple closed curve. We have shown that the cardioid is a simple, closed curve 
of best fit for a large portion of the perimeter of a human skull. Although we 
have only worked with the data points of three skull profiles, it was possible to 
transform a normal cardioid to fit each and to strain each of these cardioids into 
one another (Shaw, McIntyre, & Mace, 1974). In our efforts to find the curve of 
best fit for human skull profiles, we serendipitously discovered the mathematical 
formulation for the remodeling of the head profile due to growth, the cardioidal 
strain. 

Using the global geometric approach developed by Riemann, we next showed 
that the cardioidal strain homeomorphically transforms cardioids into circles. 
Thus, to the extent that the cardioid fits the curve of the head’s profile, these 
figures tend to be transformed into circles also by the cardioidal strain. It is a 
theorem of topology that figures that are homeomorphic to a circle are topo- 
logically equivalent figures (Gans, 1969, p. 198). Thus, to a significant extent 
(roughly 240°) the profiles of humans of different ages are topologically equiva- 
lent under the cardioidal strain transformation, that is, they are topologically 
equivalent shapes. 

Hence the cardioidal curve of the human head profile is a symmetry of this 
special topological space, and thus constitutes a shape invariant for our theory of 
growth space. Notice also (Fig. 2) that the facial features approximated by the 
cardioid are also preserved to some extent by the cardioidal strain. These, then 
are also part of the shape invariant for this topology of growth space. 

A strain transformation that was stronger (i.e., preserved fewer symmetries) 
than cardioidal strain might be too elastic and destroy all structural properties of 
the facial profile, say as a sculptor might deform a clay bust of a person into a 
formless lump. Indeed, the weakness of this particular strain transformation is its 
strength as a possible geometry for growth. 

Every growth process has a “yiscal”” component as well as an “elastic” one; the 
viscal component of the transformation preserves the structural invariants that 
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specify the species, race, sex, and individual identity while the elastic component 
expresses the transformational invariant by which the style of change due to 
growth can be distinguished from other types of change (displacements, size 
changes, etc.). Possessing both of these necessary qualities, the cardioidal strain 
transformation is a prime candidate for a geometry of the growth space of 
human faces as seen in profile. 


Evidence for the Proposed Geometry of Growth: 
Perceiving the Age Level of Faces 


Recently we began a series of experiments to test the adequacy of the cardioidal 
strain as a model for the transformational invariant of growth space (Pittenger & 
Shaw, 1975a, b). In addition to the strain transformation, a careful study of the 
remodeling of facial profiles due to growth suggests that an affine transforma- 
tion may also play a significant role, namely, shear. A shear transformation 
modifies the angle of the facial profile (the so-called “Camper” angle) making 
the chin more or less jutting (prognathic) while at the same time controlling the 
slope of the forehead. Clearly, the profile of the human face is sheared as it 
grows, for recall that the mandible grows disproportionately faster than the 
cranial dome (see Fig. 2, column under zero strain). This accounts for the fact 
that adults, as compared to babies, typically do not have such diminutive faces 
nor exaggerated craniums. 

Consequently, we felt it important to determine to what extent the perception 
of the relative age level of faces might be due to shear as well as to strain. If 
degree of affine shear accounted for the perceptual information specifying age 
level, then it rather than the cardioidal strain would be the better geometric 
model for the growth process, at least in so far as the space of human facial 
profiles is concerned. Moreover, if such proved to be the case, then the structural 
invariant of this growth space, by which the recognizable shape of faces is 
determined, would be a symmetry of affine space rather than a symmetry of 
topological space. The following experiments were run to test this possibility as 
well as the adequacy of cardioidal strain in general as a model for growth. 


Comparing strain with shear. It was assumed that a psychological correlate of 
the growth of the human face is its perceived change in relative age level. By the 
use of a computer program, a single facial profile was transformed over seven 
levels of strain and five levels of shear (Fig. 2) in the global manner discussed 
earlier. Since the product of these two transformations is not commutative, 
shear was always applied first. Hence, for example, the profile in the upper 
left-hand corner of Fig. 2 is derived from the profile at (0, 0) by shearing the y 
axis of the two-dimensional coordinate space away from the x axis in a 
counterclockwise direction by 15° and then by straining (that is, stretching) the 
units along the y axis by a coefficient of k = —.25 (where y' = k'y for k' = [1—k 
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sin 0]). In this way, the stimuli to be used in the proposed experiment were 
generated by the coordinate transformation method suggested by Thompson 
(1917) in a manner consistent with the Riemannian concept of space as a 
structure rather than an empty vessel. 

Twenty subjects were then presented with 35 slides of the transformed profiles 
depicted in Fig. 2 and asked to estimate their relative age levels. They did this by 
choosing an arbitrary number to represent the age level of the first profile 
presented and assigning other numbers to represent the age level of the succeed- 
ing profiles that were proportional to the age level estimate of the first. The 
subjects were run in four groups of five subjects each, with each group receiving 
a different random order of presentation. 

Using a Monte Carlo technique of analysis with five thousand runs to estimate 
the distribution we found that 91% of the relative age judgments made by 
subjects agreed with the hypothesis that the strain transformation produced 
monotonic perceived age changes in the standard profile. On the other hand, 
using the shear transformation to predict their judgments produced only 65% 
agreement. Thus, we found as expected that the topological strain transforma- 
tion was by far the stronger of the two variables of age-change and, thereby, was 
the best candidate to represent the remodeling aspect of growth. 


Perceptual sensitivity to strain. A second experiment was designed as a 
preliminary test of the following hypothesis: perceptual information specifying 
the relative age level of individuals plays a major role in the determination of 
certain dominance relationships in both animal and human societies (so-called 
“pecking” orders). Such information is probably also a factor in the selection of 
mates and in helping parents determine the appropriate time to wean offspring 
and encourage their independence. For instance, the length of the beaks of 
young birds apparently provides the stimulus which prompts the mother bird to 
encourage her offspring to leave the nest (Lorenz, 1943). Assuming that percep- 
tual information for age level also plays a significant role in human society, then 
one should expect humans to exhibit a perceptual sensitivity to very small 
changes along this dimension. Moreover, if strain produces changes in faces 
resembling growth, then subjects should also exhibit sensitivity to changes 
produced along this naturally salient dimension. 

Consequently, sensitivity to the shape changes produced by the strain transfor- 
mation was assessed in the second experiment by presenting pairs of profiles 
produced by different levels of the transformation and requiring subjects to 
choose the older profile in each pair. A series of profiles was produced by 
applying strain transformations ranging from k=-.25 to +.55 to a single profile, 
where k is the coefficient of strain used in the equation controlling the computer 
plots. Eighteen pairs of profiles were chosen; three for each of six levels of 
difference in degree of strain. The pairs were presented twice to four groups of 
10 subjects. Different random orders were used for each presentation and each 


126 ROBERT SHAW AND JOHN PITTENGER 


group. Subjects were informed that the study concerned the ability to make fine 
discriminations of age and that for each pair they were to choose the profile 
which appeared to be older. During the experiment they were not informed 
whether or not their responses were correct. By correct response we mean the 
choice of the profile with the larger degree of strain as the older. 

Several results were found: First, an analysis of variance on percentage of 
errors as a function of difference in strain showed a typical psychophysical 
result—a decline in accuracy with smaller physical differences and an increase in 
sensitivity with experience in the task. However, two other aspects of the results 
are more important for the question at hand. First, subjects do not merely 
discriminate the pairs consistently but choose the profile with the larger strain as 
the older profile with greater than chance frequency; in the first presentation the 
larger strain was selected on 83.2% of the trials and in the second, on 89.2% of 
the trials. In each presentation, each of the 40 subjects selected the profile with 
the larger k as older or more than 50% of the trials. In other words, the 
predicted effect was obtained in every subject. A sign test showed the chance 
probability of this last result to be far less than .001. Thus the conclusion of the 
first experiment is confirmed in a different experimental task. 

A second finding was that sensitivity to the variable proved to be surprisingly 
fine. Although no scale yet exists by which perceptual sensitivity to strain can be 
calibrated to absolute age estimates, the results of this experiment suggests that 
such a scale will have to be adjusted to very fine tolerances indeed. Subjects in 
this experiment demonstrated an ability to discriminate shape differences be- 
tween pairs of adjacent profiles produced by strain that were only a few times 
greater than the absolute limit determined for visual acuity in resolving spatially 
adjacent lines (Shlaer, 1937). This is surprising since the figures to be discrim- 
inated were never seen superimposed but always on different halves of the 
screen. Figure 3 provides a dramatic example of how extremely small these 
differences in degree of strain can be and still be reliably discriminated (espe- 
cially note the right-hand figure). 


Perceiving individual identity. Finally, a third experiment was designed to 
investigate the following hypothesis: In spite of the radical remodeling faces 
undergo when they grow, sufficient structural invariants are preserved to permit 
their perceptual identification at different age levels. Consequently, a crucial test 
of strain as a model of the transformational invariant of growth space would be 
to show that it also preserves identity. As a preliminary test of this hypothesis, 
profile views of the external portions of the brain cases of six different skulls 
were traced from x-ray photographs and subjected to five levels of strain. Five 
pairs of transformed profiles were selected from each individual sequence; the 
degree of strain for members of three pairs differed by .30 and those of the 
other two pairs by .45 values of k. A profile of a different skull was assigned to 
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—k=-—23 —k=.09 —k = 37 


--—k=-.03 ---k=.17 m mka41 
Difference = .20 Difference = .08 Difference = ‚04 
Errors (%) Errors (%) Errors (%) 

P1-0 P1-28 P1-40 
P2-0 P2-16 P2- 30 


(P1 is the first presentation; P2 is the second.) 


FIG.3 Sample profile pairs, Experiment 2: (a) errors in presentation 1 = 0%, errors in 
presentation 2 = 0%; (b) errors in presentation 2 = 28%, errors in presentation 2 = 16%; (c) 
errors in presentation 1 = 40%, errors in presentation 2= 30%. 


each of the above pairs which had the same degree of strain as one of the 
members of the pair. Slides were constructed of the profile triples such that the 
two profiles from distinct skulls which had the same level of strain appeared in 
random positions at the bottom. Thirty subjects were presented the slides and 
asked to select which of the two profiles at the bottom of the slide that 
appeared most similar to the profile at the top (Fig. 4). The overall percentage of 
errors was low. For the 30 sets of stimuli presented to 30 subjects, the mean 
error was less than 17%, with no subject making more than 33% errors, Since no 
subject made 50% or more errors, a sign test on the hypothesis of chance 
responding (binomial distribution) by each subject yields a probability of far less 
than .001. Indeed, this finding was corroborated in another set of studies where 
Pittenger and Shaw (in press) also found that people are quite able to rank order 
by age photographs of people taken over nearly a decade of growth from pre- to 
postpuberty years. 

The results of these studies provide support for two important hypotheses: 
The strain transformation, due presumably to growth, not only provides the 
major source of the relevant perceptual information for age level, but also leaves 
invariant sufficient perceptual information for the specification of the individual 
identity of the person by the shape of the head alone. An additional implication 
of these findings is that the information for perceived aging or growth may be 
very abstract indeed. In the next section, we discuss a set of experiments 
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FIG. 4 Sample stimulus Experiment 3. The upper profile, k = .30, is the same individual as 
B, k =-.15, whereas A, k = —.15, is the profile of a different individual. 


designed to help us determine exactly how abstract our model of perceptual 
growth space must be. 


Perceived Aging as the Pickup 
of Abstract Invariants of Growth Space 


Assuming people perceive relative age level by detecting the invariant aspects of 
difference relations between structural states of the face at different stages of 
growth leads to the prediction that the precise nature of the facial features 
supporting the difference relation should prove of little consequence to age 
perception. This is exactly what one finds to be the case. In the experiment 
mentioned above using photographs of faces, we found that adult subjects have 
no trouble at all providing rank-order age judgments of photographs of different 
individuals who are all unfamiliar to the subject (Pittenger & Shaw, 1975a). Here 
since no person’s face appeared twice, there is no possibility that the perceived 
difference relation is between topographical states of the same face. Therefore, 
the information for relative age level must be quite abstract, at least as much so 
as the difference relation people learn to perceive who succeed at the so-called 
“oddity” problem (Gibson, 1969). 
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But exactly how abstract is the perceptual information for the relative age 
level of faces? We suspect it is so abstract as to be essentially independent of the 
absolute features of a single face viewed over time, or of those features common 
to all faces, or even of those features common to all animate objects that grow. 
This assumption suggests the following hypothesis: perceptual information for 
age level is both abstract and global, inhering in the effects that the proposed 
strain transformation has on any object to which it can be applied. In the sense 
discussed earlier, the strain transformation, when applied to a coordinate frame- 
work, provides an abstract model for a dynamic “growth” space. Any object, 
human, animate or otherwise, represented within this growth space will be 
globally remodeled in a manner proportional to the degree of strain applied to 
the total spatial structure (as shown in the first experiment), 

A dramatic test of this hypothesis would be to show that objects other than 
human facial profiles, when subjected to the same experimental treatment as the 
stimuli used in the first experiment, will nevertheless elicit identical rank-order 
age judgments. Indeed, this is exactly what was found in two independent 
studies using computer-generated drawings of dog profiles and Volkswagen cars. 
Following the exact procedure used for generating the human facial profiles in 
the first experiment, it was found that adult subjects had no trouble rank-order- 
ing by age these sets of transformed nonhuman profiles and automobile draw- 
ings. The pattern of results were highly significant and their analysis paralleled 
precisely that reported in the first experiment. 

Although, our subjects may have experienced seeing dogs at different age 
levels, it is not possible that they could have seen automobiles grow as viscal- 
elastic events. Therefore, it seems reasonable to conclude that the subjects were 
able to rank-order the inanimate stimuli because the difference relations per- 
ceived to hold between them specifies a naturally salient style of change, 
namely, growth. 

The combined results of the above experiments testify to the explanatory 
power and empirical fruitfulness of the geometric approach to perceptual phe- 
nomena. Furthermore, this approach offers a plausible interpretation ‘Of II: 
Gibson’s (1966) claim that perception is not based upon the discrimination of 
elementary stimuli (that is, features) but upon the perception of higher order 
forms of invariant information picked up over time. 


CONCLUSIONS 


The arguments and research reviewed above, clearly suggest the need to construe 
the concept of the shape of an object more broadly than traditionally done. A 
large variety of ways exists by which equivalence classes of objects might be 


130 ROBERT SHAW AND JOHN PITTENGER 


defined ranging from rigid objects with the same Euclidean shape, such as rocks, 
cross sections of steel beams, houses, cars, to plants and animals of various sizes 
and stages of growth, whose semielastic shapes still retain sufficient, topological 
structure to allow classification into the same species, race, sex, family or 
identity category. For these reasons, it seems wiser and more useful for percep- 
tual theory to hold a liberal view of the concept of shape, since under different 
circumstances objects may be judged perceptually equivalent depending upon 
whether they share congruent shapes, similar shapes, affine-equivalent shapes, 
projectively—equivalent shapes, or topologically equivalent shapes. 

Furthermore, it should be noted that this list of differential geometric shapes 
is by no means complete; the perceptual information by which subjects may 
judge other collections of objects to be perceptually equivalent may require 
discovery of the appropriate shape invariant as defined in still some other 
geometry than those surveyed here, say in equiaffine or conformal geometries 
where areas or angles are preserved, respectively. Of course, exactly which 
geometric definition of shape is most appropriate to a given class of experimen- 
tal stimuli is not an a priori affair, but must be discovered empirically, much in 
the same way that the best distance metric must be discovered for scaling various 
kinds of data. There is an important difference which should be emphasized, 
however, namely, that the concepts of the shape invariant provided by affine 
geometry on down on the hierarchy of geometries (for example, projective 
geometry or the topologies) are nonmetric in character. This means that, roughly 
speaking, the concept of shape as specified in these weaker geometries are based 
on qualitative rather than quantitative properties of objects. 

The most important implication of the reported studies for perceptual theory 
is the following: The degree of abstractness manifested by the topologically 
equivalent shapes produced by growth suggests that the relevant perceptual 
information by which they are judged equivalent must be equally as abstract. 
Moreover, recall that a geometry is nothing less than a theory of the structure 
(symmetries) of the space, or class, of shape-equivalent objects; this suggests 
that, in order for the perceptual judgments to be made, the perceptual space of 
subjects across such a wide range of tasks must be sufficiently flexible to take on 
the structural properties of the space of the shape-equivalent objects used as 
stimuli, otherwise no interjudge reliability would be found to exist across 
subjects. Recall that in the tasks reported above, the effect was found to exist, 
not just in the group, but within each subject who participated. 

For all the above reasons, there does indeed seem to be some justification for 
holding onto the working hypothesis posed earlier; namely that not only is the 
perceived shape of an object due to the structure of the space (equivalence class) 
to which it belongs, but that the very structure of perceptual space itself is 
malleable, being dependent upon the structure of the objects perceived. If so, 
then not only does the proposed geometric approach coalesce the problems of 
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object perception and layout (“space” or “depth”) perception so that they 
might be treated in a unified way, but it also suggests how adaptation effects 
might be explained under the same geometric theory of perceptual space. 
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The study of visual and optical qualities of paintings, especially abstract paint- 
ings, is a subject of major interest to the perennial attempt to provide a scientific 
basis for art criticism. What needs to be shown is that the effects of paintings are 
causal in nature. If so, then we might expect significant relations to hold 
between factual knowledge about vision and art criticism. Further, we might also 
expect critical explanations and characterizations of such effects to be verifiable 
where most critical claims are not. Hopefully, we might shed some light on the 
subject of justification of a limited type of critical claim, a claim about the 
effects of a painting. The two questions to be explored then are, first, the 
question of justification of critical explanations or characterizations of the 
effects of abstract paintings and, second, the question of the relevance of factual 
knowledge about vision to these claims. 

As a sample characterization of an effect of an abstract painting, we might 
take the following: “The shallow, stage-like space in Jackson Pollock’s ‘Autumn 
Rhythm’ is reminiscent of late cubism.” In contrast, as a sample explanation of 
the same effect, we might take: “The spatial effect is due to the fact that the 
black skeins become figural with respect to the warm background. The skeins 
tend to assume a position in front of the picture plane. Further, the ground has a 
somewhat atmospheric quality and thus tends to recede as do objects enveloped 
in haze, mist, and other atmospheric phenomena. Finally, the texture of the 
black skeins tends to advance over the less textured warm ground.” As we shall 
see, the study of perception has discovered many facts and principles relevant to 
the critical evaluation of such claims about the effects of paintings. 

With respect to the above questions, I will defend two theses and attempt to 
develop a theory of sensibility, that is, a theory of how people develop that 
mode of visual attention required to sense various effects available in a painting. 
The first thesis asserts that specific critical claims should not be considered apart 
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from generalizable acts of sensing, such as the act of sensing direction, mass or 
balance in a painting. Thus, a critical claim about a painting, a characterization 
or explanation of an effect, cannot be understood in isolation from a developed 
sensibility to the mode of perceptual information addressed by the critical claim. 

At first sight, this thesis regarding the role of sensibility does not seem 
different from a traditional concept of aesthetic taste. However, this thesis is 
new in emphasizing that sensibility is in part due to the refinement of perceptual 
abilities (perceptual refinements). In this respect it is not at all like a traditional 
concept of taste. Moreover, as we shall see, the role of sensibility in the 
philosophy of mind is considerably different from that usually attributed to a 
mental faculty required for aesthetic judgment. 

The second major thesis of the chapter is that factual knowledge plays no 
necessary role in the confirmation of specific critical claims; rather it is more 
relevant to the question of whether general acts of sensing have any foundation 
in ordinary perceptual abilities (or what we shall call normalcy). For example, 
knowledge of correlations, such as “the effects of space due to overlapping 
forms, texture gradients, color and atmospheric quality,” can not be used to 
substatiate or refute a specific claim, such as “that painting by Pollock has great 
depth.” 

Factual knowledge, such as knowing what perceptual cues correlate with what 
effects, is relevant to understanding how people develop the ability to sense such 
effects in a class of paintings. This is analogous to the case where a natural law 
can be used to explain the occurrence of a class of events but cannot be 
demonstrated with certainty to apply in any specific case. In a similar vein, a 
psychologist may demonstrate that an organism has a habitual pattern of 
behavior but can not claim with certitude that a given response is due to the 
existing habit. Consequently, what is needed to support the two theses is a 
theory of how sensibilities to certain classes of effects may be developed. 


MEANING OF EFFECTS OF ABSTRACT PAINTINGS 


Effects, Suggestions, and Evocations 


Effects of abstract paintings can be distinguished from their suggestions and 
evocations. What an abstract painting suggests is whatever ideas it calls to mind. 
This is more ideational than what the painting evokes, which refers to the 
emotive tones of the work. Effects, on the other hand, refer to the visual and 
optical qualities of a painting, such as space, light, motion, and its dynamic 
aspect. 

A complete analysis of the content of the abstract expressionistic paintings dis- 
cussed cannot be fully understood without distinguishing their suggestions and 
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evocations from their effects. This is often difficult in practice. For instance, the 
same blob of paint that suggests blood may be a cue to space. Thus, it is often 
the case that a thing may both specify space and lend an emotive tone to that 
space. 

For the above reasons, when analyzing paintings it is absolutely necessary that 
effects should be distinguished from their suggestions and evocations. Con- 
sequently, in this chapter I shall develop a theory of sensibility only to effects. 
Thus, it must be regarded as only a partial theory of sensibility since a full 
theory would discuss sensibility to suggestions and evocations as well. 

Because I distinguish effects from recognizable images, a primary motivation 
for picking pure abstracts, rather than representational paintings, is to examine 
the visual qualities such works can create without involving acts of identification 
and recognition. For instance, Holbein’s “Erasmus of Rotterdam” hardly just 
gives an effect of Erasmus but involves the viewer in recognizing and identifying 
many familiar forms as well. 


Four Kinds of Effects 


At least four major types of effects can be noted: First, and the most clearly 
causal effects typically exploited in abstract paintings, are those which work 
solely for physiological reasons, (color contrast induced by complements being 
an example). A second, and very different range of effects are those arising from 
composition. Compositional effects can guide vision, such as when colors cap- 
ture the eye or lines lead the eye. Notice that these effects describe what 
happens to the eye in viewing rather than a phenomenal experience of the viewer 
(space, for example). Thus, a shift in idioms is required from “gives the effects 
of ” to “has the effect of > Both learned scanning habits and Gestalt 
laws of grouping are presumably relevant to compositional effects. 

A third type of effect paintings may have on viewers is due to cross-sensory 
effects. These arise when a visual cue excites an effect that is not purely visual as 
in “the forms give the impression of being massive” or “the slanting of top-heavy 
shapes give the impression of impending collapse.” (Space, of course, is also a 
cross-sensory effect). 

A final important kind of effect are the many cues sufficient to create an 
experience of three-dimensional space. For instance, figure-ground relations 
create an experience of space even when the figures are extremely abstract and 
unfamiliar. Similarly, a depth effect can be created by the fact that “cool” 
colors (blue and green) tend to recede in paintings while “warm” colors (red and 
yellow) tend to come forward. Moreover, a depth effect can be created by 
atmosphere, that is, the fact that distant objects appear cooler in color than 
nearer ones, loss of focusable detail with distance, etc., or by the overlapping of 


forms (occlusion) or the overlapping of colors (transparency). 
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Such cues as these are so basic to visual perception as to constitute informa- 
tion for the form or structure of the visual world per se and not just for its 
content. Indeed, space seems to constitute a general category of vision. For this 
reason, perceptual cues specifying three dimensionality are more basic to visual 
perception than to any particular aspect of the world or to any specific type of 
object. 

The claim therefore to be explored in subsequent sections of the chapter can 
be stated as follows: abstract paintings create phenomenal effects by setting into 
operation the categorical aspects of vision rather than by merely eliciting our 
knowlege of specific particulars and types. In other words, abstract paintings 
succeed because they exploit what Kant referred to as the “forms of sensible 
intuition.” 

These four kinds of effects, physiological reactions, composition, cross-sensory 
effects, and spatial cues, have only briefly been considered since they are so 
well-known and may be readily reviewed in detail in many books on perception 
(e.g., Gibson, 1966), or on the psychology of art (e.g., Arnheim, 1954). It should 
be emphasized, however, that there are many kinds of factors involved in these 
effects. There are the nature of the eye, habits of scanning, the disposition and 
nature of the eye as seeing figure and ground, general knowledge of atmosphere 
and texture in space, and general knowledge of cross-sensory correlations. Various 
combinations of these factors are set in operation by visual causes in 
the abstract and create phenomenal effects. By “visual cause” I mean any 
stimulus in a context (such as overlapping or placing a color on another) which 
creates a phenomenal effect. “Phenomenal effect” covers the full range of 
qualities such as space, light, and motion. “Causal correlation” means the 
correlation, based on the nature of seeing, between visual causes and effects. 
This concept is important to the problem discussed next. 


The Problem of Contextualism 


A major problem that arises in viewing art is that of contextualism. Even though 
the causal correlations cited are well-known and work in decontextualized 
illusion illustrations (figures with no context), the same correlations might not 
work in a rich context. Suppose the Müller-Lyer illusion had parallel lines drawn 
close to and at right angles to the illusory lengths. At certain distances from the 
illusory lengths, the parallel lines would clearly tend to cancel the effect because 
they would act as conflicting cues to length. Contextualism in painting is 
somewhat analogous, there being both reinforcement and conflict among various 
cues. Suppose, for example, there is a conflict between the plane a color tends to 
assume and overlapping of planes as in many abstract paintings by Hans Hof- 
mann. The richness of context destroys the agreement we can get from contrived 
examples on psychological tests. The complexity of contextualism becomes even 
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richer when it includes qualities (some of which are not effects), suggestions, and 
evocations. For example, the emotive tone of color schemes may affect the 
relative depth of the planes to which they are applied. 

If an ideal subject viewed a given painting, presumably all these variables 
would sort themselves out into dominant and subordinate effects. But with an 
actual subject, even with the most powerful devices to draw attention, there is 
still considerable variability in the manner of viewing. Viewer distance and the 
order in which an eye will “read” parts of the painting cannot be well controlled 
for paintings whereas such factors can be controlled in psychological experi- 
ments with illusory figures. 

Finally, there are at least two very different kinds of contextualism, a contex- 
tualism of effects consists of a lumping together of suggestions, elusive qualities, 
and evocations in a hierarchy of dominant and subordinate effects; however, 
there remains a second kind of contextualism—the contextualism of purposes. 
Dominant effect is not the same as the main purpose. The subordination of 
means to ends is a different kind of ordering from the relation of dominant and 
subordinate effects. A sensitivity to the main purpose of a work is different from 
a sensitivity to its dominant effect, though sensing of dominant effects is usually 
a means to a purpose and sometimes the main purpose itself. 

In my opinion, however, contextualism even in these rich painting contexts is 
overdone, and there are two important kinds of qualificiations. First, there is 
reason to believe that there is a hierarchy of effects which can be established 
experimentally due to unequal weighting of factors in the background of 
normalcy required for ordinary perception. In the example mentioned, overlap- 
ping will usually override the more subtle indication of space by color. Form 
perception therefore will usually override a more subtle spatial cue. Of course, a 
large enough area of color with a very strong spatial tendency might override 
even Gestalt indications and tendencies. I suspect one could even determine 
experimentally how much area tended to override a more basic factor such as 
overlapping. At present, however, only the second answer to contextualism will 
be developed. 

The answer (or a new form of contextualism if you prefer) is that attention is 
a process which inherently decontextualizes and recontextualizes by selecting 
certain features of paintings while ignoring others. For this reason, certain 
learned modes of attention and sensibility clearly eliminate much of the problem 
of variability mentioned earlier and, thereby, aid the painting in realizing its 
phenomenal fullness. 

The same kind of factors which operate in normal vision and are demonstrable 
in illustrations of illusions also operate in painting contexts. For example, the 
appearance of known correlations (such as texture gradients creating space) in 
critical explanations of what the effect of space is due to, offers some support 
for this claim. However, a scientific explanation of vision is not relevant to 
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criticism at the level of specific disputes over whether or not a painting gives a 
certain effect; but rather it is relevant at the level of characterizing a kind of 
sensibility to general visual categories. 

If analytical attention to visual causes, in certain modes, is not to be a fraud, 
then it must be based upon an understanding of normal vision. Scientific 
exploration justifies general acts of sensing effects over classes of paintings, at 
least in the sense of providing a characterization of real visual possibilities. 
Looking at a painting in terms of texture, overlapping, color planes, and 
atmospheres will give a sense of space; moreover, the causes of such effects can 
be theoretically determined. 

Factual knowledge of causal relations is relevant at the level of potentialities of 
the medium as viewed, not at the level of justifying some claim about a 
particular painting effect. One wonders however if there is not enough historical 
knowledge of art to tell us about known correlations, and that no causal theories 
are needed. Does art history tell us all we need to know about the potentialities 
of the medium? Perhaps not, for art history tells us about purported effects, but 
not whether purported effects have any foundation in normalcy. There remains 
the question of whether a painting works visually. 

I will return to the subject of causal correlations and contextualism towards 
the end of the chapter after developing the necessary theoretical background to 
deal with this thorny problem. We will now take what appears to be a large 
digression into the subject of attention and sensibility. An adequate theory of 
sensibility, however, is absolutely necessary before the problem posed by tradi- 
tional contextualism can be solved. 


ACTS OF VISUAL ATTENTION 


Introduction 


Visual attention will be understood here to mean a channeling of visual aware- 
ness mostly, but not entirely, by verbal means, in which the object of attention 
through characterizations or descriptions induces an “uptake” in acts of noticing. 
I will try to show that visual attention to something like color, when skilled and 
complete, is a redundant complex of verbal categories, spatial reference aids, 
techniques, and devices for focusing attention, and techniques for controlling 
the eyes. These redundant elements of visual attention can be ordered into 
objectside, intermediate, and eyeside elements of attention. “Objectside” refers 
to the particular physical object, specifically an abstract painting, as described or 
characterized, the verbal description or characterization being essential to define 
the object of attention. Attention to an object thus will always be conditioned 
by certain circumstances, though it must be recognized that attention can be 
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channeled by nonverbal means, as in animals. Nonverbal means of directing 
attention, say by pointing, constitute what is meant by the intermediate element 
of attention. However, attention to a very complex visual object such as the 
effects given by an abstract painting is certainly learned by verbal means, with 
aid from nonverbal devices and techniques. In the section called “Sensibility and 
Noticing,” I shall show that the actual physical object provides various sorts of 
uptake within the channels of attention, and thus, in a sense, the actual object of 
attention itself fills in the final gap between the verbal description or character- 
ization and the actual seeing of the object in those terms. For now, “objectside” 
means the particular aspect of a painting as characterized or described. On the 
eyeside are actual eye directions such as “blur the eyes to observe color.” 
“Eyeside” refers to eye directions such as focus, scan, blur, or squint the eyes. 
Also “eyeside” is used loosely to cover headside or bodyside directions such as 
“if you want to look just at the color scheme, then blur your eyes or stand back 
so far that you cannot recognize the objects.” For convenience I shall refer to 
objectside, intermediate, and eyeside aspects of visual skills such as attending to 
color. 

By “acts of visual attention” I mean that some modes of seeing are delib- 
erately learned, often involving practiced eye techniques, devices, types of 
attention, eventuating in a visual skill in a fairly literal sense. During the process 
of learning the lessons may be assimilated into a smooth and practiced visual act, 
like other skilled acts, with the subsequent development of what I call percep- 
tual refinement abilities—abilities such as discrimination, sensitivity, and acuity.’ 
To clarify what is meant by perceptual refinement abilities consider the follow- 
ing case: The astronomer practices averted vision, the holding of the fovea 
several degrees off of the object of attention, so as to select the rod-laden 
portion of the retina that is more sensitive to light. One’s sensitivity can often be 
extended by more than a magnitude by this eye technique, and it is amazing 
how much more detail of dim celestial objects an astronomer can see as 
compared to an untrained observer of the same visual ability. Analogously, one 
can practice modes of attention appropriate to abstract paintings, and with 
practice the act becomes smooth and efficient with the development of percep- 
tual refinement abilities which I will call sensibility. 

Of course, there are also important differences between a visual skill like 
averted vision and sensitively perceiving an abstract painting. These differences 
can be emphasized by pointing out that one does not look at aspects and 
features of nature in the same ways as at the elements of art; one does not look 
at color in the same ways as at a color scheme. Nonetheless, there is full 


“We can now suppose that the perceptual systems develop perceptual skills, with some 


analogy to the way in which the behavioral systems develop performatory skills” (Gibson, 
1968). 
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justification for calling sensitive perception an act of vision in the full sense of an 
intentional and purposive activity for sensing impressions and effects of paint- 
ings. Consequently, sensitive perception is a deliberately cultivated act. 

Also “acts of visual attention” means a visual act of some generality. There are 
no skills of perceptual refinement abilities uniquely derived from nor applicable 
to just one painting but only for a class of paintings considered in a certain way. 
This is just to say, as emphasized below, that with respect to sensibility the 
object of attention is an element of art, and elements of art must be embodied in 
at least several works. Critical characterizations and explanations often suggest a 
general mode of sensing. (There are exceptions, as when a critic characterizes or 
explains the peculiarities of an anomalous work.) The critic usually means, and 
the context will show when this is so, that the terms of the explanation or 
characterization can be applied to a class of works of some extension—say other 
works by the painter in the same period, the whole corpus of works by a painter, 
works of the same school, works of the same period, etc. 

Sensibilities, then, are visual acts of kinds or types in the sense of being about 
a class of some extension and in being specified in terms of elements of the 
works of this class. Emphasis on this sort of generality is necessary in order to 
emphasize acts of sensing, acts so general as to permit use of the concepts of 
skill, habits, practices of viewing. Consequently, all my remarks are about visual 
acts of some generality, though the visual acts may involve odd and hyphenated 
modes of attention such as color-scheme-considered-from-an-expressionistic 
point of view. By “acts of attention” then is meant habitual attention in certain 
terms to an element of art in a class of some extension. 


Viewing particular paintings. Given that the perceptual refinement abilities 
upon which sensibility rests must be developed from viewings of a wide class of 
paintings, the problem of how we attain a stable viewing of a particular painting 
remains. Although acts of attention have the generality noted above, strictly 
speaking, an act of attention is a complete act only when there is an uptake of 
noticing within the terms of what is being attended to. Consequently, the 
generality of the visual act is the terms in which one sees, while the uptake of 
noticing is a particular realization within those terms. Thus, attention when 
completed by a successful uptake, (i.e., a “Eureka!” experience), is like any act 
in being a particular occurrence; it either happens or it does not. 

Our particular viewings without a developed sensibility have no more stability 
and intelligibility than would our particular acts without the stabilities of 
character traits, habits, moods, etc. The complex activities involved when view- 
ing a particular painting includes many things—habitual modes of attention, 
memories, images of other paintings, verbal descriptions, titles; moreover, for a 
given person these categories are more or less stable. When less stable, every 
viewing is different from other viewings. When these stabilize with practice they 
provide a consistent basis in terms of which one views paintings, though of 
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course every viewing is also a unique realization of uptake. There may be many 
kinds of occurrences or kinds of uptake. A painting may appear to be definite or 
unique (style) or of a particular type; an effect may appear to result from a 
certain cause, or an image may appear to be animated with some suggestion or 
evocation. On a particular viewing a Pollock painting can have a very definite 
space, say that of late cubism, look as though the space were due to overlapping 
strands, look as though the drip gestures are intense. (Kinds of uptake are 
further discussed below.) We can notice all these things on an occasion, given 
visual skills and sensibilities. There are no doubt many things that affect a 
particular viewing other than sensibility, but sensibility is the stabilizing center. 
The situation is analogous to interpreting a person’s acts, which, as particular 
occurrences, can only be understood in terms of taking place in a background of 
more stable things such as habits, dispositions, character traits, moods, and 
specific intentions. 

To anticipate a later argument, our “verification” of the critic’s explanations 
and characterizations should not be isolated from our own sensibility and the 
state of our perceptual refinement abilities with regard to the relevant mode of 
sensing a particular painting. Our confidence in the correctness or incorrectness 
of a critical claim should be directly proportional to our awareness of our own 
sensibility in the relevant mode of attention. Before we have the relevant 
sensibility we cannot distinguish between a projection of the explanation or 
characterization onto the painting from the sensing of the painting as ex- 
plained or described. 


Objectside (Verbal) Aspects of Attention 


Generally attention is related to ostensive defining. Many techniques such as 
pointing or directing one’s glance can also be used to draw or direct attention. 
But knowing what someone means is not simply attending to that which they 
refer. The same sort of category mastery required to understand ostensive 
defining is also necessary to understand attention. Attention, however, may use 
additional means, even devices and techniques, because it places emphasis on 
perceiving the object in certain terms, that is in some specific way rather than 
just arbitrarily. 

The most relevant similarity between referring and attending is pointed out by 
Wittgenstein in his well-known arguments about ostensive defining. A child does 
not learn words by ostensive defining (though ostensive teaching can be in- 
volved) because such defining already presupposes mastery of basic categories 
and behavioral aspects of such pointing. If I say “that is sepia” the “that” can 
mean an indefinite number of things. Only if the learner knows the category of 
color can we ostensively define, for we can then say “sepia is a color, and that is 
sepia.” Before a child knows the basic categories necessary to understand 
ostensive defining, we use only trial and error (ostensive teaching), with only the 
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teacher knowing the categories. Ostensive defining requires a vast amount of 
stage-setting or category mastery. 

Analogously, attention, at least attention to anything at all complex, involves 
channeling of perceptual awareness by verbal categories. To reiterate: Attention 
is always in certain terms. Once the verbal lessons have been learned, attention 
has its autonomy, and certainly we often attend without consciously using 
verbal categories. We can honestly say “yes, I noticed the color” without 
meaning we thought about the color. 

Mastery of the concept of color requires, of course, more than just the use of 
the word “color.” A full understanding of how to attend to color can manifest 
itself in other verbal and nonverbal devices, in behavioral gestures, in techniques, 
and even in what one does with his eyes. Since attention to color is a matter of 
visual attention, there is more involved in mastery of attention to color than just 
referring or ostensively defining color, as noted above. Attention to color is a 
visual skill, which can clearly be refined to increase discriminative ability. The 
verbal and nonverbal devices, behavioral gestures, techniques, and eye directions 
go beyond what is necessary to ostensively define color, though it will be seen 
there is some overlap. For example, a child must master several verbal devices if 
he (or she) is to use concepts such as color; these devices we call part-whole 
reference aids, Such aids consist of verbal expressions or characterizations which 
draw the viewer’s attention to parts of wholes or to the wholes themselves. 
“Part” is a covering term for any number of spatial units—figments, fragments, 
parts, pieces, segments, areas, planes, regions, blended areas, brushstrokes, blobs, 
drips, etc. These combine into very different kinds of wholes. Of course, we 
usually combine part-whole reference with aspection, that is, we look at areas of 
color, sections of line, rays of light where each of the supplemental reference 
aids (area, section, ray) indicate the kind of part unit appropriate to the aspect. 
In fact, aspects are just certain kinds of part-wholes. Color is one kind of 
part-whole, texture another. In general, a child who constantly referred to a 
piece of color, or who did not show he knew that a line occurred in sections and 
sweeps, would not be said to understand the concept of color or line, respec- 
tively. 


Intermediate Aspects of Attention 


Pointing is one kind of intermediate and nonverbal aid to attention involved in 
the mastery of concepts such as color. Some kinds of pointing are even relevant 
to aspect terms and reference aids. For instance, a generalized gesture to a vaguely 
bounded area is appropriate to an area of color, while a sweeping gesture 
pointing to a sweep of line can capture behaviorally the nature and tempo of the 
sweep of line. A good museum lecturer does not point to color, line or texture in 
the same way. One does not point to pieces, sections and areas in the same way. 
A child who referred to a piece of color would possibly also make inappropriate 
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gestures in ostensively defining color for others. There is of course redundancy 
in spatial reference aids and gestures. Also intermediate are techniques such as 
cupping one’s hands so as to isolate a color from its context and from aspects 
such as shape or outline. Of course, there is redundancy between cupping one’s 
hands, spatial aids such as “area,” behavioral gestures and aspection terms such 
as “color.” Also cupping one’s hands is a good illustration of the difference 
between reference and ostensive defining and visual attention, for cupping one’s 
hands goes beyond the requirements for clear reference and clearly is a part of 
the visual skill of attending to color. Another example of an intermediate 
element in mastery of the skill of looking at color are devices such as a piece of 
paper with a hole in it. Such devices, of course, do something very similar, 
sometimes with some advantages, to cupping one’s hands. 


Eyeside Aspects of Attention 


“Completely on the eyeside are such techniques as blurring one’s eyes or standing 

back so far one cannot recognize the subject matter. Both are good eyeside 
techniques, often advised by critics and artists, for looking at areas of color in 
color schemes. In summary, from the objectside to the eyeside there is redun- 
dant and overlapping verbal equipment, behavioral gestures, techniques, devices 
and eye techniques which are employed in skilled attention to color. Any one 
element in this visual skill can be omitted or even not known since they are 
redundant, whereas lack of many of them lessens the visual skill and makes one 
more reluctant to claim mastery of the skill of attending to color. There are 
analogous redundancies in ordinary reference and ostensive defining. 


Color viewing. Mastery of attention to the category of color involves viewing 
techniques ranging from the eyeside to the objectside such as blurring the eyes, 
using devices which emphasize color, using proper reference aids such as “area” 
or “region,” looking in correct part-whole terms, attending to the color of the 
object of attention. A person may develop differential sensitivity by these 
various techniques. Thus, some methods of attending to color differ radically 
from those used to attend to texture, shape, or outline. This is due to the fact 
that color is an aspect of objects which is visually pervasive in ways that texture, 
shape and outline are not. To see what this means, select a painting, first, blur 
the eyes so as to destroy visual texture then mask the outline with a funnel of 
paper, finally, step back too far to recognize the subject matter. Clearly, each of 
these techniques possesses subtleties of its own and thus makes it quite different 
from the others as a method of attending to color. 

Thus, there is a cluster of such techniques from the eyeside to the objectside. 
These techniques range from manipulation of the eyes or their augmentation by 
devices, through nonverbal reference aids, such as hand gestures, to verbal 
suggestions indicating where one should look in a painting in order to notice 
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certain significant aspects. It is important to notice that the use of the aspect 
term “color” exhibits a category consistency, in that it refers to what remains 
perceptually invariant across all the viewing techniques mentioned, from object- 
side to eyeside. 

Attention to abstract paintings sometimes requires category abnormalities such 
as looking at the shape of holes in some of DuBuffet’s paintings and Moore’s 
sculptures. Very often, in fact, painters want to change our reference equipment 
or at least refine it. Notice Ben Shahn’s (1967) book entitled The Shape of 
Content. Such category changes or crossings affect attention on the eyeside and 
objectside. 

The direct approach of characterizing the object in the painting world, the one 
used by most critics, seems much less artificial than emphasizing eyeside modes 
of attention. One can, of course, usually derive the modes of attention from the 
characterization of the object, though not conversely. This sort of free shifting 
and interchangeability of modes of attention and objects meant is typical of 
ordinary reference too. Even though critics do not usually give eyeside direc- 
tions, one can usually derive the eyeside modes of attention from the charac- 
terization of objects as given. In addition one can also often derive further 
details about analytical techniques of looking and sensing. Sometimes these 
remedial lessons are necessary to develop our sensing. “Blur the eyes” is 
sometimes a help in looking at color schemes. Also, explaining what an effect is 
due to is remedial, but sometimes indispensable. In sum, attention is a complex 
visual skill involving eyeside techniques appropriate to objectside categories. 


Attention to Abstract Paintings 


Attention to abstract paintings has certain perculiarities due to the object’s 
phenomenal nature and to the nature of eyeside modes of attention. Because 
one often encounters entirely new types of wholes in abstract painting, ordinary 
aspection and part-whole reference sometimes prove inadequate. From the 
discussion of contextualism there seems to be two kinds of wholes: natural 
phenomenal wholes (Gestalts) involving principles of groupings, factors from the 
background of normalcy (hierarchy of effects), as opposed to purposive wholes 
where that aspect of the painting results from the artist’s successful fulfillment 
of some goal. In the latter case, the proper unit of analysis is the “means-end” 
strategy. For example, the artist may wish to convey a certain mood in order to 
dramatize a social message inherent in the content of the painting. To do so he 
may utilize the contrast effect of certain complementary colors. However, the 
part-whole relationship of the colors to the total color scheme used is not an 
appropriate unit for a means-end analysis of the artist’s intention. Correctly 
selecting and executing a color scheme, like composition, is an element of 
painting technique which is itself a subart that must be mastered by the artist. 
The desired end, therefore, is often achieved via some set of phenomenal effects. 
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Another way of stating how ordinary aspection, part-whole reference and 
elements of painting differ is to say color schemes are not the same as color, nor 
line the same as outline or natural shape. “Look at the color scheme” involves a 
different kind of looking from “look at the color.” “Notice the drawing” has a 
distinct visual feel, a sort of overall looking at contours of line edges, and no 
ordinary looking is like it (differences we shall return to shortly). Sensibility has 
to be to the elements of painting, and elements often include phenomenal 
effects. In sum, ordinary visual attention is in terms of spacetime and aspection 
reference. Eliminating time, looking at paintings is roughly looking at spatial 
parts and wholes and aspecting (e.g., denoting values of colors, shade, texture, 
etc.). Ordinary aspecting and ordinary part-whole are not completely adequate, 
however, for we must learn to see in terms of the elements of painting, in terms 
of color schemes instead of just areas of color. The contrast between elements of 
art and ordinary aspection will be the basic difference between sensibility and a 
visual act such as averted vision. 


Peculiarities of the object of attention when viewing abstracts. Peculiarities in 
the object of attention when it is an effect of an abstract painting quickly 
become obvious. These peculiarities revolve around the fact that we attend to 
visual causes in order to sense effects such as space. Ordinarily, we learn visual 
techniques to discount misleading visual cues. For example, the skilled judge of 
distance on a golf course learns to discount valleys and emphasize texture and 
size constancy of the flag on the green. We attend so as to overcome misleading 
cues, 

In abstract painting, the converse is in a sense true, We emphasize the cues and 
allow them to create a full world of phenomenal effects. Our knowledge of 
causal correlations is, in fact, an important part of our aesthetic education, and 
the particular ones cited here have been known at least since Leonardo. The 
nature of our viewing is simply the viewer counterpart to the fact that space and 
other effects have long been a part of the painter’s purposes. We must then learn 
to decontextualize and recontextualize along entirely different lines than in 
ordinary vision. The good judge of distance decontextualizes from valleys and 
recenters on textures. In a painting, we decontextualize from a flat wall and 
from surface texture of paint, and recontextualize on cues that give an illusion 
of space. We know that painters often exploit cues to create phenomenal worlds, 
and we use our knowledge of causal correlations to center on these cues in order 
to savor the size, scale, nature of, say, picture space. Of course, our recontextu- 
alizing, though in a sense odd, is itself based in normalcy in recentering on visual 
cues which cause effects because of normalcy. 4 i, 

Another way of putting the same oddity of the object of attention when it is 
effects is to say the object of attention is a merely phenomenal world. (Of 
course, we also attend to literal qualities such as paint texture.) Both the 
structure of the painting world spatially and the qualities of the “things” within 
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the world are purely phenomenal. One cannot get into these worlds to explore 
around. Space, motion, and tactile-kinesthetic qualities are in a painting only 
visually, phenomenally, and they can never be checked cross modally. This 
means that ordinary sensibilities (such as a sense of distance, sense of mass, 
senses of various tactile—kinesthetic qualities) are activated, but cut off from 
usual pedestrian modes of confirmation. (For suggestiveness and evocation we 
employ still other ordinary sensibilities such as a sense of danger, sensing a 
threat, sensing a mood. These probably involve sensing subtle features of 
behavior and gesture.) Ordinary sensibilities are confirmed and developed by 
ordinary mundane and nonintuitive procedures. Using ordinary vision to gather 
tactile-kinesthetic information such as how something would feel, how stable 
kinesthetically it is, can be checked. The apparent distance, depth and scale of 
space can be checked. But one cannot explore around in the painting world. It is 
there only visually, in fact, even more tenuously, it is there only to our 
sensibility. Not only can we check if our ordinary sensing is accurate, but we can 
cast our attention to a wider context in places we know we are likely to be 
fooled, and we learn which sort of sensing to practice to override misleading 
cues. A good judge of distance has to learn to discount and emphasize, just as 
does a good judge of weight or mass. 

A way of generalizing this point about oddities of the object of attention when 
the object is an effect is to say that qualities in a painting become special 
sensibles of vision even when those qualities are not ordinarily special sensibles. 
Space and motion are not ordinarily special sensibles, for other senses can 
confirm them, But a painting only looks spatial (in the volumetric sense), and if 
it looks spatial then it is phenomenally spatial so far as critics and viewers go. In 
fact, we must say somewhat more weakly that a painting only gives a sense or 
impression of space. The space exists only to our sensibility, thus I prefer the 
idioms “gives a sense of” and “gives an impression of” to concepts with such 
ordinary epistemological baggage as “looks,” “appears,” and “seems.” 

Ordinarily we are interested in going from how things look to how they really 
are. In a painting, we are interested in how paintings really look, or more 
accurately what effects and impressions they give to trained sensibility. Thus 
there is much greater dependence on descriptions of others, though of course we 
want eventually to develop integrity in our own sensibility. Critics, of course, 
freely use “looks,” “appears,” and “seems” in discussing effects and impressions, 
but they assume that the context will make clear an implicit qualification “looks 
spatial in the way paintings look spatial.” They assume we will take their 
remarks as being about sensibles that are special to vision. 


Eyeside peculiarities. If we switch from the objectside (phenomenal world of 
painting) to the eyeside matter of modes of attention, we discover some 
distinctive modes of attention, or at least we discover some ordinary modes 
being strangely employed. In this section, it is necessary that I be somewhat 
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autobiographical in trying to discuss the phenomenological qualities of my own 
looking at abstracts, “Look at the color scheme” calls for a mode of vision 
which I call nonfocused field looking. One can blur his eyes or stand back some 
distance and try to see the whole field at once. We look at areas and regions, but 
not in a focused manner as we do with textures. One does not look at outlines or 
volumes. “Look at the texture” can take at least two modes of viewing. There is 
foveal scanning where acute vision is employed on areas or regions of texture. 
Further, there is nonfoveal field viewing where one tries to have acute vision of a 
field. A base runner at second base practices nonfoveal field looking in trying to 
watch the shortstop and pitcher at once. A more common example of nonfoveal 
field looking is when driving one is, say, trying to watch a car in the left lane, a 
car at an intersection and a bicyclist all at once. This sort of looking is like trying 
to attend to all the textures of an expanse at once. Of course, the angular size of 
the field is very important in this kind of looking, and that is one reason distance 
from a painting is often important. 

A related mode of vision, though it is nonfoveal spot or area looking instead of 
field looking, is the astronomer practicing averted vision in order to see a dim 
galaxy. “Look at line drawing” is different from the above. Here one does not 
look at areas, or even at fields, but one’s eye will often sweep the line, especially 
with regard to fine calligraphy. One looks at how sections or sweeps lead into 
and create rhythmical variations on other sections and sweeps. “Look at spatial 
composition” is nonfoveal and field, but now it is not in terms of areas of color 
or texture, but in terms of masses and qualities of equilibrium. Balance and 
proportionality are important. “Look at dynamic qualities” can be field, in 
terms of masses, but now the orientation of the masses is basic. Dynamic qualities 
also can mean linear rhythms. 

Nonfocused field looking, various kinds of nonfoveal field and foveal scanning 
is only a suggestive beginning of a list of modes of attention. The typical objects 
of attention in a painting (color schemes, drawing, composition, etc.) are 
applicable to specific abstract paintings. One looks for the tensions in Hofmann’s 
planes of color by a sort of nonfoveal, limited or narrow area field, with keen 
awareness of the edges of the planes, trying to sense the position they tend to 
assume. It is not like looking at color schemes, nor like the edges of fine 
calligraphy, One’s eye butts against the edges of Hofmann’s plane, it does not 
sweep these edges. The best description is perhaps just that we look at a plane of 
color, an outlined geometrical unit in space. Pollock’s “all-over” composition 
lends itself to many modes of attention. One can practice nonfoveal field 
looking from a distance to sense the location in front of the picture plane of the 
strands of black. One can move closer and practice foveal scanning, savoring the 
equal distribution of energy and interest. 

Perhaps, in fact, Pollock’s all-over composition is one of the best forms of 
composition for the inherently variable nature of attention to paintings. The 
extreme variability of visual attention does not disturb his painting as it does 
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traditional compositions which try to maintain special centers of interest, special 
places the eye is lead or drawn. Similarly, Barnett Newman intended for his 
works to be viewed from a distance such that they flooded the field of vision 
with subtly modulated colors. One cannot take them in at once, and, except for 
the bands, there is no place the eye rests. Thus, one falls into nonfocused 
scanning. Mark Rothko, in his very atmospheric works, also intended for the eye 
to wander in a void of color field and atmosphere, while at the same time the 
strong blended verticalities and horizontals of framing color areas emphasize the 
picture plane. 

The modes of attention could be indefinitely expanded. In fact, a subtle 
phenomenology of viewing would show peculiarities of viewing for each painter, 
and even sometimes for each painting. I have tried to show that visual attention 
is a complex of redundancies from eyeside to objectside, and to suggest some 
peculiarities of attention on both the eyeside and objectside where one attends 
to purely phenomenal qualities such as space. A characterization of these 
peculiar visual skills will be given shortly. 


Decontextualizing and Recontextualizing 


Every visual act of attention inherently involves some decontextualizing and 
some recontextualizing. Decontextualizing involves discounting certain space- 
time and aspection modes of attention, where recontextualizing involves empha- 
sizing certain part-whole and aspects, thus forming a new kind of whole, so to 
speak. In ordinary vision one decontextualizes and recontextualizes to avoid 
illusions, In judging and sensing distance on a golf course, one decontextualizes 
or discounts valleys and recontextualizes on texture and size constancy of the 
flag and green. One discounts cues one knows to be misleading and emphasizes 
cues one knows to be more reliable. If we decontextualize and recontextualize in 
certain ways we can certainly prevent the painting from working. We decontex- 
tualize from, say, surface texture in order for the spatial effects to work. Such 
decontextualizing and recontextualizing might well be noticed in changes from 
foveal and focused vision to nonfoveal field vision. We even decontextualize in 
our most synoptic modes of attention. 

One never just looks for or at a dominant effect, but at or for a dominant 
effect of some kind. As an example we might take some visual acts of attention 
with regard to a painting by Hans Hofmann. We do not look at the work simply 
as areas of color such as we might find on a test palette. We do not attend 
excessively to the texture of paint, though the picture plane is certainly impor- 
tant in Hofmann. (In fact, the texture is an important element in setting up the 
tensions Hofmann desired. The tensions come from alternating modes of vision.) 
These decontextualizings make inoperative irrelevant modes of attention. Then 
we recontextualize. We do a nonfoveal field looking with a heightened awareness 
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of the edges of the planes of color. We attend to areas or regions which include 
several planes of color. And we try to sense where the planes extend to, which 
ones tend to recede and which ones tend to advance. 

This is recontextualizing along lines that are relevant to allowing the painting 
to do what it is meant to do, a mode of attention which aids the cues which have 
a foundation in the background of normalcy. “Look at the color scheme” 
decontextualizes from textures, shapes, volumes and recontextualizes on areas, 
regions or fields of color viewed simultaneously. Areas or regions have vague 
boundaries, and thus the eyeside directions of blurred vision or distant viewing. 
Part-whole and aspects may fall into one kind of grouping before instructions on 
visual attention than after. Early in our experience of abstracts we may look for 
fragments, pieces and figments of ordinary objects. But in pure abstracts with all 
over painting there must be different elements and part-wholes. These are ways 
of decontextualizing and recontextualizing. 

Another way of stating these points is this: The thing is dissolved in abstract 
paintings, dissolved at least into figments and fragments. The thing was begin- 
ning to be dissolved by traditional composition and color schemes where the 
purposes of art overrode considerations of representational fidelity. When the 
dominant consideration becomes the elements of art, the painter can still have as 
much or as little relation to nature and human nature as he wishes, and he does 
this by exploiting or limiting factors in the background of normalcy. At the 
extreme of elements, where the thing is completely dissolved because of con- 
siderations of art, remember that we are still looking at the results of acts of 
painting, and these tie in richly with normal acts. Although some acts are not 
gesture-like, as in choosing, placing, arranging colors in paintings by Piet Mon- 
drian, with regard to Pollock, however, the lines are acts of gesture. They 
constitute a celebration of the creative act, and we sense a painting by Pollock 
much more kinesthetically than we do one by Mondrain. Some modes of 
attention are peculiar in that they demonstrate the dominance of artistic 
technique over the realism of what is depicted; thus, they emphasize the 
elements of art rather than the content of the painting. For example, cross- 
category modes of attention requiring one to notice the shape of a hole in a 
Dubuffet exhibit what we mean by decontextualizing and recontextualizing. 

Since all visual acts involve decontextualizing and recontextualizing, there is 
nothing odd about contextualism in painting. Thus some of the bite is taken out of 
the objection that causal correlations, though operative in abstract painting, are 
rendered useless as generalizations because of contextualism. A form of this 
objection occurs below, however, in saying that knowledge of causal correlations 
will not refute or verify a specific critical explanation or characterization of 
effects, but that such knowledge is relevant at the level of analytically learning a 
general visual act. One looks at many paintings in terms of the knowledge 
expressed in the proposition that “texture creates space,” and with practice one 


152 MARCUS HESTER 


learns to sense space in these terms. But the deduction: “Texture creates space. 
There is texture in Pollock. Therefore there is pictorial space in Pollock” will not 
do. However, having assimilated the generalization that texture creates space 
into a practiced visual act does increase our sensibility, and we do decide on the 
basis of our sensibility whether Pollock creates space or not. The confidence in 
our judgment rises with the practice of our visual act of sensing space due to 
texture. 


Synoptic and analytic modes of attention. Decontextualizing and recontextu- 
alizing of various sorts can be stated in terms of the distinction of visual causes 
and effects. For the more analytical type of attention, one attends to the visual 
cause and expects the effect to follow. For example, one attends to texture with 
the expectation that a sense of space will follow. Our knowledge of causal 
correlations, thus, is assimilated into this sort of analytical attention. Instead, we 
often practice a more synoptic mode of attention, doing something like attend- 
ing to the space without being aware of what causes the sense of space. Even 
though in synoptic attention we may not notice texture as a cue to space, we no 
doubt see, in some weaker sense than explicit visual noticing, the texture, for 
texture can cause us to sense space even when we do not explicitly notice it. No 
doubt when we sense phenomenal qualities in a painting there is a cause for our 
sensing just as there is a cause of the Miiller—Lyer illusion even though we cannot 
cite the cues causing the illusion. Illusions need causal explanation. Often critics 
just characterize the kind of space, and we simply agree or disagree on the 
characterization so far as our sensibility goes. Obviously both kinds of attention 
decontextualize and recontextualize on cues such that, given a background of 
normalcy, the painting gives a sense of space. Synoptic and analytical modes of 
attention are further examined below. 


SENSIBILITY 


Attention with any degree of complexity is like any learned skill. At first it is 
clumsy, stiff, academic and unresponsive. Later it can become smooth and there 
is a noticeable increase in perceptual refinement abilities. Attention is learning to 
see in certain terms. Sensibility just means the development of perceptual 
refinement abilities subsequent upon practice of certain modes of attention. 
Perceptual refinement abilities most literally are specifiable in terms such as 
discrimination, acuity, and sensitivity. But with regard to skilled acts like averted 
vision, such terms do not fully describe the skill. Averted vision requires motor 
skill in order to resist foveating the object of attention. Further, it requires 
ability to attend to a nonfoveal region. Thus even though averted vision increases 
sensitivity, sensitivity alone is not a sufficient description of the act. Unidimen- 
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sional concepts such as sensitivity, discrimination or acuity are not sufficient 
specifications of any visual skill. Analogously, increase in literal perceptual 
refinement abilities such as discrimination and sensitivity are relevant to in- 
creases in sensibility, though not of course definitive of sensibility. Perhaps it 
does not need arguing that habitual attention to color increases one’s discrimina- 
tive ability, and that such discrimination is relevant to sensibility to color 
schemes. A critic who could only discriminate a few colors would hardly be 
taken seriously as an analyst of color schemes. Excellent discrimination is a 
necessary condition for any high degree of sensibility to color schemes. Of 
course, discrimination alone is not sufficient; a fuller characterization of various 
forms of sensibility will take us into various peculiarities of these kinds of visual 
acts, into differences alluded to earlier as the distinction between ordinary 
aspects and elements of art. 

It will perhaps be helpful to state more exactly what kinds of perceptual 
refinement abilities are appropriate to abstract paintings. This is especially 
imperative since even more literal visual skills such as averted vision cannot be 
characterized in terms of discrimination, acuity, or sensitivity alone. An answer 
to this question has already been given in the discussion of eyeside and object- 
side techniques and categories for perceiving abstract paintings. There are a 
number of eyeside techniques such as blurred vision and various kinds of field 
looking. Objectside there are matters such as sensing the position of planes of 
color and sensing tactile-kinesthetic qualities. In a sense the only way of 
specifying the perceptual refinement abilities is to go through a series of abstract 
paintings and comment on different abilities needed for each. This has already 
been done in the section just mentioned (eyeside and objectside factors in 
attention), and now it remains only to reemphasize the plurality and open- 
endedness of the perceptual refinement abilities needed for abstract paintings. 

Defining the perceptual refinement abilities in terms of the object of attention 
is not so circular as it may seem. We now think of emotions as being constituted 
by the object of emotion and not, as in dualism, as being merely contingently 
related to the object (see Kenny, 1964). Analogously, one cannot specify the 
kind of attention apart from the object of attention. Painting effects are 
somewhat peculiar and unique even though they exploit visual normalcy for 
their very existence, and this also has been indicated in the comments on 
painting effects as special sensibles. Thus it seems permissible to express percep- 
tual refinement abilities as increased ability to sense complex and hyphenated 
effects such as the position-colors-tend-to-assume-spatially or the tactile- 
kinesthetic-qualities-of-forms. No skills can be specified in terms of isolated 
abilities, Skills are specified in terms of their typical activity even though 
isolated aspects of the skill may be common to many activities. The structure of 
each activity recombines isolated skills from other activities. Sensibilities of 
various kinds are related to other visual acts, but the distinctness of sensing 
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effects as elements of art recombines ordinary skills in unique ways. Good critics 
have highly developed perceptual refinement abilities to effects and impressions 
of paintings which is part of what is meant by saying they have sensibility. 


Sensibility and Noticing 


Attention is distinguished from noticing. One can pay attention but still not 
notice something. Attention is a preparation for acts of seeing, sensing, and 
noticing. “Notice” as a command form does mean to pay attention, but paying 
attention does not guarantee one will notice in the success sense. Attention is 
related to the general state of being attentive, being observant, being alert. One 
can even notice without being attentive, as in “my attention was aroused,” “I 
inadvertantly saw,” “something caught my eye,” or “my eye was drawn.” But 
even though ordinary noticing can take place without attention, involuntarily, 
noticing a subtlety can hardly take place without preparation, thus the oddity of 
an unexperienced person saying, “I suddenly noticed the cubistic space of a 
Pollock.” Since noticing subtle effects of paintings requires sensibility, such 
noticing could not be a form of noticing without having cultivated certain visual 
skills. We can inadvertently or accidentally notice only those things we can see 
with ordinary perceptual skills. One does not inadvertently notice the details of 
a dim galaxy or the spatial structure of a Pollock. 

Noticing involes a sort of uptake on the part of the object of attention. In a 
purposive activity like looking at or for something, there is a give and take of 
vision. We practice space-time reference, aspection and modes of attention 
visually, and often the object rewards, so to speak, our looking at or for. 
Sometimes the uptake on the part of the object is very minimal as in “notice the 
ted” said of a very ordinary red; however, this sort of noticing is really noting. 
Moreover, this sort of noticing is close to ordinary referring. The only kind of 
attention referred to here is that which occurs in aesthetic contexts. Here the 
object requires considerable perceptual subtlety, and, therefore, there is room 
for more uptake than in ordinary aspection. “Notice the blue” uttered in front 
of a painting by Newman means “Notice the elusive, unique, peculiar, atmos- 
pheric blue!” In fact, if in an art museum I say “notice the blue” and you 
simply see an ordinary blue you will likely say something like “I don’t see 
what you mean. It looks like a very ordinary blue to me.” Likewise, if you do 
see what I mean when I say “notice Newman’s blue” you will have a more 
dramatic uptake than in ordinary attention and you will say “I see what you 
mean” as a sign that the painting has rewarded your attention. 

There exists a variety of uptakes bridging, so to speak, the final gap between 
the verbal channels of attention and the particular visual object actually seen. 
The nature of the verbal act and the verbal context tell us what kind of uptake is 
expected. The object of attention is a particular as characterized or described. 
Some of the kinds of uptake which the particular provides within the channels 
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of trained attention are these: In characterizing a spatial effect the critic is often 
trying to articulate the uniqueness, the je ne sais quoi of the painting space. In 
characterizing one tries to express that which we could see before the charac- 
terization but could not say. Characterization in articulating subtle effects 
achieves Isenberg’s ideal of criticism in bringing about communication at the 
level of the senses (see Isenberg, 1949).? One sort of uptake then is savoring 
uniqueness, and “I see what you mean” can mean “I grasp the elusive visual 
quality of just this work.” The uptake could be very different if we are to see 
the painting as belonging to a school or type. “Notice the late cubistic space in 
Pollock” gets a very different sort of uptake from “Notice the unique and 
peculiar space of Pollock.” “Notice that Pollock’s space is due to overlapping 
strands of paint” gets a third sort of uptake, for here we get an abstractable 
schemata by which we can look at other paintings. Here we notice what the 
space is due to and not the uniqueness of the space or the type of space. “Notice 
the intensity of Pollock’s paint gestures” is different still. Uptakes then can 
include seeing uniqueness, typing, explaining, and characterizing the emotive 
tone. 

In summary, critics practice many verbal acts (describing, analyzing, charac- 
terizing, interpreting, explaining) in many contexts for which there are a wide 
variety of uptakes. 


Sensibility and Perception 


A basic difference between sensibility and averted vision is that sensibility 
includes sensing phenomenal effects. But even sensibility to effects is perceptual 
attention. To notice a given effect one must attend in the proper visual mode. 
Indeed, the logic of sensing as defined here entails perceiving. I would deny that 
one was sensing an effect of an abstract painting unless one was perceiving the 
visual cause of that effect. But this is also true of ordinary visual sensibilities. 
Consider for example one’s sense of distance, balance, direction, mass, etc. These 
are not internal senses, rather each is a mode of perceptual awareness that is 
exercised in specific activities with respect to specific objects. In other words, 
they are object and activity specified. For example, a sense of direction may 
involve perceptual skills such as perceiving sun and star positions, feeling prevail- 
ing winds, recognizing familiar landscapes from many angles, possessing an 
accurate visual memory, etc. 

Sensibility is a mode of visual attention in that it implies an ability to use the 
eyes in a skillful way so as to sense the effects and impressions of paintings. 


2See also Wittgenstein’s (1953) comments on the contrast between the color known to 
everyone and the particular impression of a color Iam now getting. “I don’t turn the same kind 
of attention on the colour in the two cases. When I mean the colour impression that (as I 
should like to say) belongs to me alone I immerse myself in the colour—rather like when I 
‘cannot get my fill of a colour’ [p. 277].” 
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These effects and impressions, however, cannot be specified or defined apart 
from their visual causes. 

Further, sensing is a mode of perceptual attention in the sense that the effects 
are achieved by exciting the background of perceptual normalcy. This is a key 
difference between sensibility and various forms of projection. Seeing the 
Miiller—Lyer illusion is perceiving. One does not perceive the illusory lengths in 
the full success sense of “perceive,” but one of the lines certainly looks longer 
than the other. 

There is no one relation between seeing and sensing, though all types of 
sensing are modes of perception. The richness of their relationship has been 
hinted at several times before. Some of the differences are indicated by ap- 
pearance concepts. Complements really look more intense when near each other, 
but paintings do not really look spatial. Several factors may be operating here. 
Perhaps we are more reluctant to attribute “really looks” to an effect which is 
not a special sensible of vision. Also relevant is what we might call the phe- 
nomenological quality of the look of the object. Giving a sense of space does not 
have the same feel visually as color complements appearing more intense. 
Further, giving a sense of space is not like really looking spatial in the sense that 
we might confuse picture space with real space. Matters such as these provide 
reason for preferring the idioms “gives a sense of” or “gives an impression of” to 
concepts which carry a lot of ordinary epistemological baggage such as “looks,” 
“seems,” and “appears.” Finally, as emphasized at the outset, “causal correla- 
tion” is used as a covering concept. There are important differences, not 
explored here, between kinds of visual causes, kinds of effects, and even kinds of 
relationship between visual causes and effects. Very different qualities such as 
color intensification, a sense of space, a sense of motion, a sense of light, etc., 
could not all be expected to have the same sort of relationship to visual causes. 
There is no single relation existing between seeing and sensing, rather all the 
causal correlations mentioned are activated in modes of visual attention that are 
based in visual normalcy. The modes of visual attention in sensibility are unusual 
in that one is not trying to discount the illusionistic tendencies as in ordinary 
vision (going from looks to how things really are), but one is trying to sense the 
uniqueness of the phenomenal world of this painting. A definite or distinct 
visual impression is the goal of this sort of seeing, but it is a mode of seeing. 

The uptake to be expected from these kinds of visual acts is quite different 
from identifying types or recognizing particulars. Averted vision, however, does 
concern recognizing and identifying. There are clearly visual acts which specifi- 
cally have to do with the categories of vision, the spatial “jelling” of the visual 
world, without which no animal could see at all. Seeing is less oriented to type 
and more oriented to a general category like seeing three dimensional objects in 
space. My dog can see a tennis ball but not identify it in the full sense of type 
identification; moreover his seeing consists in knowing the ball is a solid bouncy 
object, and acting as though he knows this. He does not know the ball is for 
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playing tennis. Another way of putting this is that sensing space effects induces 
the very structure of the visual world of the painting, and ordinary spatial cues 
induces the very structure of the ordinary visual world. The falling into place 
of the structure of the visual world, its perceptual “jelling,” is a different visual 
act from recognizing and identifying particulars or types of ordinary objects 
within the spatial world. 

Orientation in space is also categorical. The parallel here to Kant is deliberate. 
Space, orientation of objects in space, the essence of a three dimensional object 
are more like forms of intuition than like particulars and types. Abstract 
paintings activate these very general categorical aspects of vision. The emotive 
tones also are attached to such “spaces” and “things.” Thus their generality, 
but vagueness. 

There are also acts on “objects” within picture space more analogous to 
recognizing and identifying. Shapes and their orientation sometimes tend to 
express tactile qualities. But here again there is something more universal than 
identifying a type or particular, namely, a bare seeing in more abstract and 
universal categories, Because abstract painters excite the more universal aspects 
of vision such as texture, overlapping and atmosphere to create space, these 
determine the very form of seeing, and not just the content of seeing a particular 
or type. 

The full force of pure abstracts, that is those lacking recognizable objects, is now 
evident: In this domain the primary visual acts of recognition and identification 
do not operate. At best the “things” in abstract paintings are what Austin 
(1962) called “odd things” such as planes, areas, and atmospheres. However, 
since abstracts operate on the level of “odd” things rather than ordinary objects, 
there is no way to evaluate the perceptual “Selling” that occurs when the 
painting succeeds. Thus, the phenomenal world of abstract paintings is funda- 
mentally odd and subject to much imprecision in judgment. 

In summary, there are structuring acts such as jelling whereby the painting sets 
up its basic spatial structure when correctly perceived. There are also visual acts 
within the picture space having to do with “recognition” or “identification” of 
“things” such as blobs, atmospheres, planes, or lines. Identifying and recognizing 
have to be put in quotes because they are not ordinary type and particular 
identification. Instead they have to do with very general schemata of things, 
schemata such as shapes, forms, planes, and lines in picture space. The “identi- 
fication” and “recognition” of things in abstract paintings thus sets into opera- 
tion our general knowledge of categories of things instead of setting into 
operation, as genuine identification and recognition do, our knowledge of actual 
particulars and types. Thus, one can recognize that abstract forms, shapes, lines, 
colors, etc. sometimes are abstracted from nature, and when so they retain only 
the most general categories of things. Of course, a painter can maximize or 
minimize the relation of his abstract elements to nature, but it is important to 
emphasize that paintings without recognizable objects or identifiable types can 
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set in motion our abstract schema of things. As Kandinsky noted, it is more 
difficult to expurge nature from painting than it is to imitate her (Kandinsky, 
1947). Basic visual categories function with the least visual cues. Further the 
points are explicitly relevant to understanding the relation of seeing and sensing: 
Both the structuring acts such as jelling of the basic painting space and more 
limited acts such as “recognition” or “identification” are visual acts. With regard 
to picture space, one does perceive visual causes such as texture, overlapping, 
and atmosphere and sense the space so set up. The visual act of structuring even 
with regard to effects and impressions of space (in contrast to actual space) is a 
visual act, in a certain mode, and with visual skills. The visual acts which are 
close to ordinary visual acts of recognizing and identification are visual acts in 
the genuine sense because one does perceive shapes, forms, orientations of 
shapes and forms, etc., in a picture space. Other acts of sensing such as the 
sensing of natural groupings (based perhaps on Gestaltist principles of grouping) 
are also genuine visual acts. Of course, as with ordinary vision, there is a rich 
interplay between structuring acts such as sensing space due to texture and 
specific acts of “identification.” There is much redundancy and mutual rein- 
forcing which is visually sorted into a coherent “hypothesis” or “reading” of 
space. Sensibility to these sorts of things is a perceptual refinement ability. Thus 
sensing is essentially related to seeing. 


Sensibility to the Elements of Art 


It should be emphasized that modes of attention, and thus sensibilities, are 
developed in terms of aesthetic positions such as purism, expressionism and 
abstract naturalism; thus sensitive vision is developed at least as much by 
experience with works of art as by ordinary visual acts. The elements of art 
cannot be considered, as has been done thus far, as abstractions such as color 
schemes and drawing divorced from aesthetic positions. No critic just looks at a 
color scheme, but looks at color schemes from a particular point of view such as 
purism, expressionism, or abstract naturalism. Moreover, these various aesthetic 
positions differentially structure the complex ordinary abilities carried over into 
an aesthetic context. Different kinds of sensibility do not have the same 
structure. For instance, a purist looks at color schemes in terms of harmonies, 
thythms, color qualities such as luminosity, spatial positions of color planes, 
relations of color schemes to other pure painting elements such as shapes, and 
various tensions and reinforcing of all these levels of color schemes. The 
expressionist discusses sensibility and attention in many of the same terms, but 
with further emphasis on emotive tones, moods, expressive power or impact, 
suggestiveness, mythological—archetypical meanings, depth psychology and sym- 
bolisms of color schemes. The abstract naturalist also shares many terms of 
attention with the purist, but there is further attention to various sorts of 
resonances of abstract elements with nature, including suggestions, hints in the 
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color schemes to colors of things in nature and whole relational resonances 
between the painting considered as a microcosm or metaphor to nature. Each of 
the three positions mentioned thus calls for a complex and hyphenated mode of 
attention and sensibility. Subsequent to the exercise of trained sensibilities of 
various kinds are pleasures peculiar to each, purism giving a pleasure in sensing 
subtleties of painting elements considered in themselves, expressionism giving a 
pleasure perhaps like catharsis, abstract naturalism giving a pleasure in meta- 
physical insight. It is further evident that the kinds of sensing abilities called for 
in these aesthetic positions are very different. The continuities and differences 
between kinds of sensibilities and ordinary visual skills are also very different. 
The purist might emphasize sensitivities which have some continuity with 
ordinary sensibilities to harmony, balance, and rhythm. The expressionist con- 
nects sensibilities to ordinary sensibilities to moods and emotive tones of color, 
gestures, etc. 

We can make these further remarks about the relation of the natural eye and 
the cultural eye both of which are of course idealizations. We can very roughly 
divide the things which come to pervade and constitute attention into verbal and 
nonverbal matters, and both verbal and nonverbal matters structure ordinary and 
cultural vision. 

In the ordinary visual world, we see objects which have all sorts of other 
qualities due to our expectancies of what these objects do in our activities. We 
see unstable forms and shapes at angles which tend to collapse. We see objects 
and expect tactile kinaesthetic experiences from the way they look. We see odd 
things such as clouds and blood, and these have wider meaning. These are 
nonverbal matters pervading ordinary visual attention. Also pervading ordinary 
attention are aspection terms such as “color,” “texture,” “mass,” etc. Both sorts 
of matters can come to animate the abstract and odd “things,” the very 
schematized and categorical “things” in abstracts. A form may be animated with 
tactile kinaesthetic qualities. A drop of red may come to be animated with visual 
images of blood and with the meaning of blood. Attention to color schemes is 
pervaded by the ordinary meanings of “color.” Put differently, from acting and 
talking in the visual world, visual images and meanings are stored in all levels and 
kinds of memory, which can be deliberately activated by cues employed by the 
painter, although in the case of abstracts there is a deliberate limitation on the 
range of cues since no ordinary recognizable objects are present. 

The structure of the abstract and the “things” in the abstract almost inevitably 
activate the more generalizable and categorical aspects of ordinary visual objects. 
Since our experience of paintings comes to be stored in all levels and kinds of 
memory, then Pollock’s backgrounds may remind us of atmospheres in nature 
and thus come to be animated by images of atmosphere. On the other hand, 
Pollock’s spaces may remind us of late cubism and thus come to be animated by 
images stored from viewing cubistic works. From our visual experience of 
paintings and our knowledge of the process of painting, the image may come to 
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be animated with constructionists qualities. Unlike natural objects, a painting is 
a constructed and made object. Pollock’s drip gestures may come to be animated 
with what it would feel like to make those gestures, with constructionists ideas 
of this painting. Further, not only do these sorts of nonverbal matters from the 
history of painting come to animate our present viewing of Pollock, but verbal 
matters do too, as in titles and specific declarations of intention given by 
Pollock. Specific verbal declarations of intention raise the very complex problem 
of the intentional fallacy (if it is a fallacy), and I shall return to this problem 
shortly. What “background of normalcy” must now mean is not just a subject 
with normalcy due to ordinary visual acts, but a normalcy that includes various 
kinds of memory derived from skills and practice on paradigm art works. 

There are many levels of memory operating in the natural eye and cultural eye, 
and some of these levels are more easily put out of gear than others. Visual 
habits are very hard to put out of gear, while recognizable objects can be 
eliminated. Other levels of natural visual memory are not put out of operation 
by abstract paintings, and perhaps these levels are better exploited than ignored. 
They are of course very greatly exploited in the plastic qualities in abstract 
expressionism, in the creation of spaces by using cues from normalcy. The 
painting image is bound to be animated by some forms of memory from 
normalcy. When we go into an art museum, we can consciously employ certain 
levels of cultural memory, like images from members of a related school, but 
there are no.doubt levels of the natural eye which are less controllable and will 
operate willy-nilly. 

It is a common experience that the order of seeing a series of paintings 
determines striking similarities or differences. The way a painting looks depends 
on which paintings one has just seen. Probably this is true of what we have just 
seen in the normal world. Seeing an ice cream sundae makes us see some 
Oldenburgs differently. Every viewing is an occasion or occurrence, and our 
active memory and less consciously aware levels of memory no doubt all 
operate. What we will say at any one time depends on our active vocabulary at 
that time, and analogously how the painting will appear depends on active 
elements in our visual memories on many levels. There are concepts, verbal 
statements of intent, visual images, memory on many levels—all operating to 
form the painting I now see and the impressions, effects, suggestions, and 
evocations which come to animate this particular viewing depend on active 
memory, both cultural and natural, at the time of viewing. This need not be 
hopelessly subjective, for given an ideal of the natural eye, cultural eye, extent 
and relation of them, we can prune away extraneous elements in our active 
memory and heighten others. 

The line between the cultural eye and natural eye is obviously fluid, some 
viewers having normal ordinary visual experiences but almost no training of the 
eye on paintings and others having a great deal of training on how to view 
paintings. Further, some aesthetic positions and painters deliberately exploit the 
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natural eye by activating normalcy, and others deliberately limit the influence of 
normal seeing. Some more investigation of the relation of the cultural eye and 
natural eye is called for, but these few remarks can be made here. Obviously for 
most viewers, the natural eye has considerable advantage in terms of amount of 
exposure, and thus we would expect the cultural eye to be rather pervaded with 
habits, knowledge and kinds of attention from the ordinary visual world, per- 
vaded as a second language often is by one’s native tongue. Presumably, with 
sufficiently radical changes in our visual environment (given some liberties with 
the nature of the eye also), we could learn to sense genuinely the purported 
effects of any abstract painting, for we could devise an environment in which the 
background of “normalcy” would be contrived to make the painting work. 

Obviously this is never done except in the very limited sense that the en- 
thusiasts of a certain painter might get so much visual exposure to his work as to 
make the purported effects actually work, at least in the eye of a narrow coterie 
or clique. For the more ordinary viewer the relation of the cultural eye and 
natural eye is more like this: For the nonverbal aspects of attention to a 
painting, we would expect these to be heavily pervaded by the nonverbal aspects 
of the natural eye. But we need to specify how pervasive the influence is by 
making a further distinction in types of nonverbal matters in normalcy: 


1. What has been called a background of normalcy consists of extremely 
basic things like scanning habits, general knowledge of correlation of looks and 
tactile kinaesthetic qualities. Here there obviously has to be a very great 
dependence of the cultural eye on the natural eye, for the very creation of 
effects depends on painting cues selectively activating the background of nor- 
malcy in order to create effects. 

2. With regard to modes and centers of attention in normalcy, things which 
draw, stop or lead our eye, and the relation of modes and centers of attention 
learned from looking at paintings, there is a much looser relationship. 


Normalcy in attention does not exert such influence over the cultural eye’s 
attention and center of attention for the background of normalcy concerning 
attention is not presupposed and exploited to create effects. The following 
speculation on the relation of these two modes of attention is offered. When the 
element of art to which we have learned to attend is significantly like an aspect 
of nature to which we ordinarily attend, there is no doubt considerable 
influence of the natural eye on the cultural eye. For example, color-in-space-con- 
sidered-puristically is significantly similar to the color aspect, and thus we can 
expect that matters which draw our attention to color will also operate in 
attention to color schemes, that is, attention to an aspect of nature will 
somewhat pervade attention to an element of art. 

But color schemes also have autonomy, for we look at color schemes by a sort 
of nonfocused field viewing technique in an attempt to see the relation of color 
areas to each other. These differences in attention to color and color schemes are 
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of course reflected in the pleasures of savoring colors versus the pleasures of 
sensing a color scheme. For an element of art such as line, which is intrinsically 
connected to the art of drawing, there is perhaps even more independence of the 
cultural eye’s attention from that of the natural eye. Line drawing is not the 
natural aspect of outline, especially in abstract paintings. When there is a 
significant similarity between the visual cause of some effect and those factors 
attracting normal attention, we can expect natural attention to be exploited in 
order to cause us naturally to notice an element of art. Principles governing 
normal vision, such as Gestalt formation and other factors that play a role in 
structuring and focusing normal attention, no doubt function in the painting 
context. 

These remarks can now be related to the above comments on visual causes and 
contextualism. Contextualism is lessened when our attention is naturally drawn, 
by genuine sensing, to modes and centers of attention by which painting effects 
are created. It is true, however, that subtle effects do not demand attention in 
ways which we would describe as striking or compelling, and thus we have to 
learn to sense these effects. Further, our attention is often attracted to an 
element of art because of our sense of craft to that element of art, and this sense 
of craft, as analyzed below, is developed in a cultural and not a natural context. 
These speculations need to be carried further. 

With regard to verbal matters affecting the relation of the natural eye and the 
cultural eye, a different set of continuities and differences exist. We can apply 
verbal directions regarding how to look at a painting with great ease and 
flexibility; in fact, appreciation may be no more than verbal knowledge of a 
painting’s purpose and looking at it in those terms. But, alas, we cannot see and 
sense paintings with such flexibility for sensibility as perceptual refinement 
ability is a set of skills, and like all skills it has the status of ingrained habits 
which are practiced in, and respond to, the context of their usual employment. 
Eyeside matters such as types of techniques and other matters in seeing such as 
scanning habits and general knowledge which works very directly and im- 
mediately in structuring the spatial world cannot be changed as easily as verbal 
directions on how to appreciate a painting. Thus we have the problem of the 
relation of appreciation and sensibility. 

This very elastic postpuristic concept of sensibility is not even aesthetic. No 
position on purism, expressionism or abstract naturalism is assumed. Further, no 
claim is being made about the limits of sensibility, for the subject of suggestions 
and evocations of abstract paintings have not been dealt with, nor have we 
discussed the degree to which natural and conventional meanings can truly be 
sensed. Many modern critics and painters agree that one must not bring knowl- 
edge of symbolism, purposes and factual matters known about nature to a 
painting, but must be able to sense a painting’s wider qualities. I have simply 
tried to show what sensibility is at the level of philosophy of mind. It consists of 
various kinds of perceptual refinement abilities; thus I have tried to clarify what 
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we mean when we say a painting “gives a sense of ” or “gives an impression 
of ” or “has the effect of .” If investigation of vision shows that 
abstract paintings really cannot carry, at the level of sensibility, the rich sort of 
metaphysical and expressionistic content purported for these paintings, we shall 
have to say that these paintings do not truly give a sense of many of the 
purported suggestions and evocations. 


Sensibility and Appreciation 


At the verbal level, appreciation could be as extensive as art, for we can easily 
shift verbal instructions. Thus we can appreciate the whole history of western art 
as in a book like Gombrich’s (1972) The Story of Art. There is, however, a 
retarding stability in our vision. The nature of the eye, scanning habits, basic 
factors of orientation, general cues to the structure of the visual world, general 
knowledge of abstract and odd things, these all have a considerable inertia of 
their own. Modes of sensing cannot be put on with the same ease and frequency 
as appreciation, for each mode of sensing requires specializing in certain skills 
that may be incompatible with those of other modes of sensing. We can no more 
simply decide at this moment to sense a painting in its relevant mode than we 
can decide suddenly to be a skillful astronomical observer or biological micro- 
scopist. We can decide to seek the kind of sensibility required, but we cannot 
achieve sensibilities as easily as we can appreciation. 

It is, of course, often surprising to discover how fruitful it is to see and sense a 
painting in a mode foreign to that intended by its creator; on the other hand, 
there are clearly times when a mode of sensing works against the work of art. 
Every serious painter can ask that we appreciate his point of view, but the form of 
sensibility we develop must respond to other considerations as well. At the level 
of appreciation we can have both maximum sympathy and maximum distribu- 
tion with regard to other paintings, but it is doubtful whether we can have this 
at the level of sensibility. Here some hard choices arise, for our sensing is simply 
the viewer counterpart of the question of what art should be. 

The distinction between sensibility and appreciation, however, can be some- 
what overdrawn. A consideration of what can be called synoptic and analytic 
modes of attention will help soften this distinction, for in the more synoptic 
modes of attention there is considerable overlap of appreciation and sensibility. 
In fact, appreciation is essentially related to sensing, for sensing is assimilation at 
the level of actually seeing what were perhaps merely verbal formulations of 
purpose at the level of appreciation. 

Where analytic modes of attention may become more appropriate to extended 
viewing of a painting, since it has to do with actually sensing the cause of effects 
present, synoptic modes of attention are usually more appropriate in the initial 
stages of viewing. Up to this point we have dealt primarily with the nature of 


analytic modes; we now need to consider synoptic modes. 
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Synoptic modes involve preliminary sampling of what the painter attended to 
most carefully, where and on what he or she expended most effort, and the 
nature of his or her primary purpose. This kind of sensing does not rely just 
upon perceptual refinement abilities, but can be increased by knowledge of the 
history of art and a feeling for crafts of various sorts. The purely natural eye 
certainly would not be drawn to the main purposes in a painting by the same 
cues which ordinarily attract attention or draw attention or stop attention, The 
non-purely visual sensing of what a painter intends requires a flexible entertain- 
ment of purposes, a part of which is a sense of various sorts of craft or 
accomplishment. One would hardly notice the quality of a color scheme or 
drawing without having experience with quality of drawing or color schemes in 
other works. One gets to know what quality looks like. One entertains various 
elements of art from various points of view until one senses some quality in the 
value and craft sense. There is even a sort of uptake here, the painting answers, 
so to speak, to some mode of sensibility (purism, expressionism, or abstract 
naturalism). The uptake is not purely visual as in “notice the uniqueness of 
Pollock’s space” but is an uptake in the sense of the painting seeming richer and 
fuller from some aesthetic point of view. Originally, we may have expected a 
more innocent eye with regard to painting than we have in ordinary life, 
however, we must now admit that training in appreciation helps to develop 
sensibility to the elements of art in a way analogous to how practical experience 
helps develop part-whole and aspection skills. 

Sensing dominant effects or impressions is an unspecialized intermediate form 
of sensing, a kind of sensing of fields of force, places to which attention is 
drawn, aspects or elements which draw attention. Dominant effect denotes the 
main and dominant emergent effect on a whole series of subordinated effects. 
Thus synoptic attention involves a true visual sensing of dominant effects and a 
sensing of main purposes conditioned by knowledge of the history of art. No 
doubt sensing dominant effects is not only a matter of causal field sensitivity 
based on normalcy but is conditioned by appreciation, that is by recognizing 
that the context of the painting is another element of art. Knowing to entertain 
a rich range of elements of art helps one sense the real dominant effect for one is 
less likely to be blind to some overriding consideration in that particular painting 
context. If one knows how to sense space in terms of texture, overlapping, 
atmospheres and color planes he is less likely to ignore the context than one who 
senses space only in terms of overlapping. 

Synoptic attention involves feeling out the basic nature of the painting world, 
while specialized perceptual refinement abilities are for only certain painting 
worlds. I do not make synoptic and initial attention entirely a matter of 
perceptual sensing of effects. Synoptic attention then involves sensing main 
purposes, which does not involve perceptual refinement alone but is more develop- 
ed by appreciation, and sensing dominant effects used as means to main purposes. 
The latter is a perceptual sensing of a highly contextual kind, involving sensing 
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natural Gestalting and aspection. It is more like field looking for subtle ten- 
dencies and counter tendencies visually. The perceptual refinement abilities 
developed with regard to dominant effects and impressions do not involve 
matters such as discrimination, sensitivity and acuity. Rather they are perceptual 
refinement abilities to emergent effects in what are sometimes elaborate hier- 
archies of effects. Sensing dominant effects is obviously very basic in dealing 
with the problem of contextualism. 

Presumably, sensing dominant effects would never develop without knowledge 
of typical purposes of painters, but this does not mean that one cannot, with 
practice, develop ability to sense dominant effects or impressions, and thus the 
form of synoptic sensing becomes a mode of true perceptual sensing. Many 
modes of attention, as well as many other skills, would be almost impossible to 
learn without verbal teaching, but attention and sensibility can develop integrity 
of their own and become skilled sensing. Averted vision is seeing, though one 
would never naturally notice dim galaxies. Likewise sensing dominant effects is a 
highly contextual sensing, though one would never naturally notice them, and 
thus a dominant effect is rarely a striking or compelling effect as the Müller- 
Lyer illusion is a striking or compelling illusion. To merely notice some subtle 
visual effects takes training. 

Correlated to visual acts of synoptic versus analytical types, are critical verbal 
acts of synoptic (characterizing) versus analytical types (explaining effects). Of 
course, just as there is no unique level of critical verbal act, some critics doing a 
lot of characterizing of effects and little explaining, there is no unique level of 
visual act, whether synoptic or analytical. We largely learn the visual acts from 
the verbal acts of various sorts. This is the significance of relating attention to 
ostensive defining. We quite literally, as argued earlier, learn to see in certain 
terms, We assimilate “categories” in a unified eyeside-objectside act of vision. 
We can now add that we learn to see at certain levels from analytical attention to 
a detail or part to synoptic attention to wholes which make up elements of art. 

Sensing what is important in a painting is not being visually sensitive so much 
as being perceptive in a wider sense. Perceptiveness includes discovering how to 
talk about a painting, possessing or developing a sensitivity to its purpose, while 
sensitivity to visual effects may be phenomenal, visual or even optical. Having 
great perceptiveness to a painting is not the same as merely seeing how the 
painting really looks. “Really looks” is a claim about visual functioning, and 
many perceptive remarks are not of this sort. Perceptiveness involves something 
other than sensibility, sometimes it involves saying things which after being said 
are obvious and important, such as “painter x always shows his subjects 
frontally” or “after period x, the painter used only analogue colors.” “Hof- 
mann’s color schemes are never moody, but show a jocund and cheerful 
personality” is a perceptive remark, but not one requiring visual sensitivity. This 
kind of perceptiveness is very important for describing main purposes of a 
painter. It goes with sensitivity to purpose and to style. 
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Another kind of perceptiveness is to the quality of work. One’s attention 
might be drawn to the quality of drawing because of knowledge of the state of 
the art, but this is not like noticing an effect. One can see quality when one’s eye 
has art historical training and exposure. A critic needs this sort of perceptiveness, 
but it is different from being sensitive in some specific visual regard. There are 
many kinds of factors in what we might call a standard of performance. One 
might see quality because of familiarity with the usual level of skill in that kind 
of material. Departure from a standard of performance might well cause one 
familiar with it to notice important things about a painter, a noticing which 
would be perceptive but not visually sensitive. Often stating what the painter 
obviously did involves entertaining a wide range of what could have been done 
and what was not done. What is left out is important, especially if it pervades, so 
to speak, what was chosen. Often a choice is made polemically against other 
choices. Choices are made meaningful by knowing the possibilities or alternatives 
considered. Certainly fairness involves sensing what the painter emphasized. 
Knowledge of art history, books like Gombrich’s (1972) Story of Art, certainly 
expand the number of purposes one considers when viewing a painting, and 
certain kinds of perceptiveness are obviously increased by consideration of a 
wide and flexible range of possibilities. Perhaps with respect to modern art being 
literate is not required, but what is required is almost a technical knowledge of 
painting. Acts of vision may be even more important than in traditional painting 
where much value depended on subject matter. Very sophisticated modes of 
attention are required, and these require knowledge of a wide range of possibili- 
ties. 


Sensibility and Philosophy of Mind 


It is very difficult to specify what sensibility is because none of the well-de- 
veloped categories in philosophy of mind, so far as I know, describe it. It is 
neither an emotion, sensation, perceptual faculty, nor character trait (habits), 
but possesses some similarities to them all. It is like dispositions, such as habits 
in that habitual employment of certain terms of attention becomes ingrained, 
and relevant sensitivites are heightened. This is the truth in saying sensibility is a 
skill. It is like seeing in that it involves modes of sensory attention in contrast to 
imagining, projecting into, or even seeing as. Full sensibility, in contrast to the 
limited form of sensibility to effects, is related to emotions, moods and emotive 
tones in that it is sensitivity to suggestions and evocations of the painting. 
Sensibility is related to ordinary sensibilities expressed in such idioms as “a sense 
of balance,” “sense of rightness,” “intuition,” “instinct,” “sense of distance,” 
“sense of danger,” “sense of space,” “sense of mass,” etc., etc. Sensibility is a 
covering term which can be best characterized as a group of refined perceptual 
abilities, though this is rather vacuous until the object to which one is sensitive is 
specified. Ordinary life equips us with and develops a wide range of sensibilities. 
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Of course, the nature and proficiency of a specific sensibility varies according to 
the kinds of habitual activities. In ordinary contexts also it is essential to note 
that the sensibilities come to verifying tests unlike those in the pure phenomenal 
world of painting. I have divorced sensibility from what was traditionally meant 
by taste, and have drawn sensibility closer to sensitive perception. This certainly 
can affect the way we think of critical judgments. I have tried to show that 
learning to see abstracts is not a metaphor, but is in some ways analogous to 
learning to see galaxies, though there are important differences. My theory 
places the critic’s judgments much more in the area of art historical statements 
of importance due to exploiting the medium, and the medium is now seen to 
include important parts of our background of normalcy. 


CONCLUSION 


The problem of contextualism, the problem of the intentional fallacy, and the 
problem of distinguishing projecting from sensing, that is, of distinguishing the 
projection of descriptions, characterizations, explanations, etc. into a painting 
from sensing what is actually present, are all problems of the same kind. 
Contextualism with regard, for example, to explaining a spatial effect is the 
problem of whether a known correlation in fact operates in a particular painting, 
whether something like texture is a cue to space in that painting. The intentional 
fallacy, as a version of the same problem, is expressed by statements by a painter 
who says “I intended to create an effect of space by the use of texture.” A 
further version of the same problem is a critic’s explanation “the space is due to 
the use of texture.” Appreciation is seeing a painting in certain terms. Sensi- 
bility, however, is a different matter, and we cannot evaluate with any confi- 
dence a characterization or explanation of a painting effect until we have 
developed our sensibility as a refined perceptual ability in the mode relevant to 
the verbal characterization or explanation. Once we are practiced at looking at 
texture as creating space, once we start to see and sense definite spaces (a 
gradual and difficult process of learning), we can with some confidence evaluate 
various verbal acts such as characterizations or explanations of painting effects. 

We can have no confidence in the nature, or even the existence of, Pollock’s 
space without being practiced in a visual act of some generality. The generality 
can be limited or extensive. We can sense his space in terms of other works of 
the same period, the corpus of Pollock’s works, works of other abstract ex- 
pressionists, works of late cubism, works of Cézanne and those influenced by 
Cézanne, etc. Sensibility must, however, be a visual act of some generality. 
Sensibility as perceptual refinement abilities, as skills and habits of seeing and 
sensing, cannot be one time occurrences. Every act of viewing when attention is 
completed by the uptake of noticing is an occurrence, as noted above, but 
sensibility is the stable core of acts of viewing, and it is this stable core of a 
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visual act of some generality by which we evaluate any characterization, explana- 
tion, etc. We should no more expect to evaluate verbal acts apart from sensibility 
than we should expect to be able to evaluate actions apart from more stable 
elements such as character traits, habits, moods, etc. Sensibility is the stable and 
developing core of specific acts of viewing. Analogously, we cannot characterize, 
describe, or even see details of a dim galaxy the first time we practice the act of 
averted vision. It is only by practice that we develop the relevant sort of 
perceptual refinement abilities. This completes the summary of my arguments 
for the first thesis mentioned in the introduction, namely, that we should not 
and cannot evaluate a critic’s or painter’s explanation, characterization, etc., 
except in the context of a developed sensibility. 

The second thesis can also be dealt with rather summarily, for the problem of 
factual information about vision as related to critical acts of characterizing or 
explaining is a part of the same problem of distinguishing sensing from project- 
ing a description, explanation, characterization, etc., of a painting effect. Factual 
knowledge is relevant at the level of general acts of sensing, not at the level of a 
specific viewing, for the same reasons given above. However, we can make these 
further significant, though limited, claims about the fruitfulness of the relation 
between investigations of vision and criticism. I expect that the most direct 
relevance between factual knowledge about vision and criticism will be at the 
level of critical explanations of effects, for we can reasonably expect that 
explanations of the form “space in Pollock is due to texture” will be both 
increasingly defined and, perhaps, even changed by increased knowledge about 
vision. One might counter by saying that all the visual causes of effects a painter 
and critic need to know have already been discovered. in the history of painting, 
and there is no doubt that painters in experimenting have discovered how to 
create a wide range of effects. However, even so simple and well known illusions 
as the Müller-Lyer illusion is not definitely understood in terms of its visual 
causes. There is no reason to doubt that when a definitive explanation of this 
illusion is forthcoming, it will have phenomenological relevance in articulating 
and even changing the cues we thought were causing the illusion. We will be able 
to articulate the cues we are now only seeing which cause the effect. A causal 
correlation can work without our being able to say what the cause is, though no 
doubt we are seeing something which is consistently causing the illusion. Eventu- 
ally, we shall be able to say exactly what the cue is which is causing the illusion, 
and this will increase our phenomenological awareness of exactly what we are 
seeing when we sense the illusion. 

I expect the same sort of increase in phenomenological awareness in criticism. 
There is reason to believe that facts about vision will increasingly enter the 
critic’s explanatory repetoire and that explanations such as “texture tends to 
create space” will be refined and even changed. This is not to say that science 
will replace criticism, for acts of sensibility are directed towards and developed 
by the history of painting, and the generation of effects in the painting context 
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do not respond to the same interests or cut the same way as illusion illustrations 
in psychological experiments. Acts of sensibility are queer on both the eyeside 
and objectside. However, I see no reason to doubt that knowledge about vision 
will articulate and even change what we thought were the visual causes of 
distinct effects in painting context. There are no doubt many visual causes of 
definite effects we can see but can not articulate. For all these reasons, one 
expects critical explanations will be especially enriched by factual knowledge 
about visual effects. 
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The Development of Thought: 
On Perceiving and Knowing 


Peter B. Pufall 
Smith College 


Piaget’s (1970b) thesis that “To know is to transform reality in order to 
understand how a certain state is brought about [p. 15]” seems straightforward. 
Yet the apparent simplicity of this statement quickly slips away as one attempts 
to examine the structure and function of these cognitive activities; moreover, it 
becomes particularly complex when one tries to reconstruct the genetic history 
of thought. For Piaget (1970a), knowing and its development are intimately 
linked to action, He presumes that thought is a biologically derived function, 
which not only structures the objects on which it acts but also restructures itself 
through its own activity. To draw out the biological analogy, any such struc- 
turing activity consists of two functions: assimilation, a function by which the 
object is incorporated into existing structures (that is, analogous to the digestive 
process by which food is assimilated into the blood stream) and accommodation, 
the function by which the existing structures are modified in accordance with 
the object to be known (that is, analogous to the change in composition of the 
blood as it accommodates the digested nutrients). 

One consequence of this action—based theory of cognition is the denial that 
perception is the efficient cause of knowing. Rather Piaget views perception as a 
process by which only figurative aspects of objects (color, shape, texture, 
etc.) are known. Thus, perception is specific, referring to concrete aspects of 
objects, rather than to general, abstract systems of activities by which symbolic 
representations of objects can be constructed, transformed or related to one 
another. The abstract system of logical and mathematical operations by which 
this is accomplished is characteristic of the highest level of conceptual activity, 
namely, operational knowing. 

Three implication arise from Piaget’s treatment of perception as a form of 
figurative rather than operative knowing: (1) since perception is an aspect of 
figurative knowing it develops only insofar as developmental changes in operative 
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functioning are reflected in figurative knowing; (2) as an aspect of figurative 
knowing it is incomplete, dealing with static figural states not invariants; and 
(3) it does not provide the basis of operative development and, in fact, if 
emphasized over structured transformations can yield distorted information. 
These implications become clearer in the preoperatory child’s understanding of 
quantitative invariance. He infers a relation from figural states (2) which distort 
the underlying quantitative relation (3). However, the choice of one dimension 
over another as an index of what “more” or “less” means is not given in the 
figural state but rather is an intuition reflecting operative functioning even at the 
preoperatory level (1). 

The distinctiveness of operative and figurative knowing is also exemplified in 
the development of the child’s ability to perceive and reconstruct geometric 
forms. 

Preschool children, about age 3, can discriminate among geometric forms but 
cannot reproduce them; all closed figures, circular or angular, are reproduced as 
closed circular figures. Thus, there is a developmental lag between perception 
and performance (Maccoby, 1968; Maccoby & Bee, 1965; Olson, 1970). The lag 
indicates that to perceive similarities and differences is not to know directly how 
they are reconstructed. The latter knowledge depends on the development of 
operative structures marked by the reversible relation of analysis and synthesis. 
These operative activities emerge slowly and are gradually elaborated into more 
comprehensive systems. Not until around age 4 does the child accurately copy 
a square, and this same square rotated 45° to look like a diamond cannot be 
drawn for one or two years after that. The relations intrinsic to the square do 
not change. The relation of that object to its field and to the child are changed, 
and, as a consequence, the particular action systems which have to be engaged to 
reproduce it are altered, That is, new operative structures have to be developed. 
(The recent work of Goodnow & Levine, 1973, on the grammar of action 
provides some provocative insights into the particulars of this developmental 
sequence.) 

Piaget’s analysis of the problem of graphic representation is often confused 
because of his use of the term schema, which is sometimes interpreted to mean 
image or icon (Gibson, 1966), and also because he stated that an image is an 
internalized activity (Piaget & Inhelder, 1969) which is often confused with a 
motor-copy position on perception. This confusion results from the fact that 
activity is often equated with movement per se, that is, the morphological 
pattern, and not understood as an abstract structure or scheme (Piaget, 1970a), 
which is generalizable and not particular. 

The same type of criticism arizes when memory is involved because Piaget 
holds that memory implies the organization information into schemes. These 
schemes themselves are not learned or memorized in the restricted sense used by 
psychogolists but rather they exist and develop by their very functioning. He 
explains memory in the restricted sense as “simply the figurative aspect of the 
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conservation of schemata (to be understood as schemes)” (Piaget, 1971, p. 187; 
author’s own brackets).! 

If operational knowing does not have its genesis in perception or figurative 
knowing in general, then how does Piaget explain the genesis of knowledge? 
Piaget (1967) puts it succinctly”. . . every structure presupposes a construction. 
All these constructions originate from prior structures... [p. 150] .” His posi- 
tion is stated more forcefully in the following passage: 

...each period of his (her) development partly explains the periods that follow. This is 

particularly clear in the case of the period where language is still absent. We call it the 

“sensorimotor” period because the infant lacks the symbolic function; that is, he does 

not have representations by which he can evoke persons or objects in their absence. In 

spite of this lack, mental development during the first eighteen months is particularly 
important, for it is during this time that the child constructs all the cognitive substruc- 
tures that will serve as a point of departure for his later perceptive and intellectual 

development . . . [Piaget & Inhelder, 1969, p. 31]. 


In short then every structure (act of knowing) generates from another structure. 
This position has been interpreted as implying either some form of preforma- 
tionism or a subjectivism. The former position is rejected on the grounds that 
operative structures develop only gradually and the latter on the grounds that 
structures emerge in a systematic fashion across individuals. Piaget does not deny 
that there is an obligatory relation between the environment and the construc- 
tion of thought. Every environmental intrusion precipitates a compensating or 
equilibrating activity on the part of the knower. Each equilibrating activity 
involves both an assimilative activity (a conserving function in thought) and an 
accommodative activity (a nonconserving function in thought). These reciprocal 
functions guarantee the balance between stability and openness to new informa- 
tion necessary for adaptive progress. 

A novel aspect of Piaget’s theory is that cognitive structures not only equili- 
brate to environmental intrusions but are also self-regulatory. By mere virtue of 
the fact that the structures function they become internally modified so as to 
form a more coherent system. They reconstruct themselves so as to achieve a 
new, higher level of equilibrium and thereby become increasingly more able to 
compensate for environmental intrusions. As will be noted, to achieve this higher 
level of equilibrium, the cognitive structures become relatively independent of 
particular content and, thus, more abstract. 

That cognitive structures function in a stable fasion fits comfortably into our 
understanding of homeostatic mechanisms. Consequently, each level of cognitive 


! This explanation seems to avoid the criticism leveled by Gibson. The mind is not 
characterized as a collection of images which mediate perception or memory. However, to 
the author’s knowledge there is no clear explanation of how a specific experience can be 
imagined or remembered. That is, if to imagine or remember is to reconstruct from 
operatory structures which are general there seems no way to decide what specific charac- 
teristics should be remembered, or how they are processed in the act of remembering. 
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development can be characterized as a set of homeostatic structures. This fact 
can be illustrated with respect to the preoperative period of thought where two 
sets of objects are equal in number if set in one-to-one correspondence. When 
the elements are displaced so as to destroy the perceptual correspondence, the 
two sets are judged to be unequal and the longer row to be numerically larger. At 
this level of development the homeostatic structure conserves quantity in terms 
of length relation. Conserving this structure does not, of course, conserve 
number at the level of thought. While at another level the child might judge in 
terms of length if the two sets differ dramatically in terms of length, he will now 
judge in terms of density if the lengths are comparable (Pufall & Shaw, 1972). 
This suggests two homeostatic functions which conserve themselves on the basis 
of different information. Finally, at the operative level the child judges that two 
sets remain equivalent even though perceptually different, indicating that the 
transformations and the perceptual variations they yield are compensated in 
thought. Thus, the developmental course seems to be marked first by a single 
structure which is equilibrated, followed by a transitional state of disequilibrium 
in which substructures are differentiated but not yet integrated. And, finally, a 
new state of equilibrium emerges in which these substructures are integrated. 
Hence, in this way, homeostasis is ultimately achieved at a higher level. 

Homeostatic functioning is conservative, in the sense that it tends to maintain 
the child at a given level of development. Consequently, since it accounts for 
functioning at a given level, it cannot account for changes in level of functioning 
as the child moves from one period to another. At best, the above analysis in 
terms of homeostatic processes only provides a descriptively adequate under- 
standing of development. But development itself implies a violation of homeo- 
stasis. Hence, to explain the transition between levels of development requires an 
augmentation of the homeostatic process by another process to account for 
change. Piaget (1971) calls this additional process, homeorhesis. Here the 
homeostatic process accounts for the continuities of development within sin- 
gle periods; the hemeorhetic process is needed to account for discontinuities in 
development exhibited between periods. Consequently, equilibration does not 
specify the ordered course of development; it only specifies that a structure will 
function and that it will respond to the object or event to be known. To 
supplement this mechanism, Piaget (1970a, 1971) proposes abstraction as a 
regulative mechanism of change. 

For Piaget there are two forms of abstraction, each of which is action based, 
but which differ in terms of the focus of abstraction. Abstraction from objects 
per se or physical abstraction is the discovery of properties of objects, whereas 
reflective abstraction is the discovery of structure within the actions per se. It is 
obvious that the latter abstractions yield logical-mathematical structures while 
the former yields an understanding of physical properties of objects. These 
physical properties vary in their degree of abstractness depending on the level at 
which they are known, that is, sensory—motor or operatory. It is also true, then, 
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that the nature of these abstract properties is going to reflect the functioning of 
logical-mathematical reasoning. Physical abstractions are physical not because 
they are directly perceived or known but because their reality is imputed to the 
object and not to actions on that object. Perhaps some examples reflecting the 
functioning of these forms of abstraction will make them clear. 

A sensory-motor understanding of ball might include an appreciation that it 
can be grasped, lifted, bounced, etc. While this knowledge is clearly action-based 
it is also obvious that it refers to properties of objects. They have a size to which 
the hand can accommodate or not, a weight which is within the child’s limits of 
strength or not, and a quality which makes it return when thrown against a hard 
surface. At the operational level of knowing these same properties might be 
known in a more abstract manner. Balls can vary in terms of their volume, mass, 
or elasticity. These higher level properties are discovered or abstracted through 
the child’s own actions; however, in this case the actions are operational. That is 
the actions are structured so that both observed and impled transformations are 
considered. 

A set of balls can also be logically structured, for example, transitively related. 
We know that if ball A is bigger than B, which in turn is bigger than C, then A is 
bigger than C. The transitive structure which makes this conclusion logically 
necessary is not absrtacted from these balls nor is the the structure imputed to 
be a property of that or any set of balls. Moreover, the transitive structure is 
abstracted from the child’s own actions of ordering but reflected to the level of 
operational reasoning by virtue of the fact that action is no longer sequential (A 
is followed by B which is followed by C) but is reversible (B can both be larger 
than C and smaller than A). 

Piaget’s position on cognitive development has been challenged. Some have 
argued that cognitive development is genetically based in perceptual processes. 
Others have argued that cognitive development is based in action but they have 
interpreted Piaget to mean that development is an internalization of particular 
movements not the abstraction from actions. Briefly, let me take each of these 
in turn. 

The provocative findings of Bryant (1974; Bryant & Trabasso, 1971) challenge 
not only when operatory schemes develop, but more significantly challenge the 
hypothesis that operatory schemes are abstracted from actions. Bryant (1974) 
proposed that young children can make transitive inferences based on perceptual 
information specified by an external framework. The implication of his position 
is that these frameworks are detected directly and that the inference is passive in 
the sense that the structuring framework does not have to be constructed by the 
child. Thus, given the appropriate perceptual conditions and minimizing the role 
of performance factors such as memory, inference is a necessary outcome. 

There are serious questions about these conclusions given the facts of the 
experiment. Why is the route to success so tortuous if the inference is based on 
information specified in the perceptual context? While memory for the contrast- 
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ing pairs seems to be implicated in the answer to this question, it seems to beg 
the question. Why is it such a difficult problem for the very young child to 
remember the relations within a pair? Perhaps the difficulties exist because the 
child does not have the available structures for organizing the information and, 
consequently, the sequence of relations is understood as arbitrary rather than 
necessary. There also seems to be good reason to question the perceptual nature 
of the inference. In one experiment the child never saw the lenths of the sticks 
but only saw the tops of the sticks. Yet they performed as well as those who had 
seen the sticks. Therefore, it seems implausible that the child necessarily 
“... compares different objects which he sees at different times through their 
own common relation to a constant background [p. 57].” Indeed, if this were 
the case children in the latter condition would be compelled to infer that all 
elements were equivalent. 

While there seems to be ample reason to doubt Bryant’s conclusions, there 
remains the need to explain the findings and, particularly, to see whether they 
are consistent with respect to Piaget’s position. Youniss and Furth (1973) 
suggest that the behavior of the children in Bryant’s studies does not necessarily 
implicate transitive inference. In an unpublished study they have shown that 
young children will impute a “transitive” relation to a sequence of events even 
though there is no logical reason to do so. By contrast, older children do not 
because they understand transitive logic. It seems reasonable to argue that 
younger children do not have an overgeneralized understanding of transitive 
logic but, instead, they are capable of ordering events into a sequence. Perhaps 
this ordering is similar to ordering a set of actions when making a plan. Once 
ordered one can make “inferences” about what subactivity comes before an- 
other subactivity but certainly that “inference” should not be confused with a 
cognitive activity implying a logical necessity. 

This argument is consistent with Piaget’s theory in two critical respects. One: 
it holds that even the pseudotransitivity behavior of young children is based in 
the active organization of information. Two: it suggests that logical transitivity is 
not at first passive nor based in figural knowing or perception. 

This is not to claim that operational transitivity in particular and logical- 
mathematical thought in general develop out of overlearned habitual systems 
such as that studied by Bryant. Nor is it to be taken as a validation of 
neobehavioristic positions such as that of Berlyne (1954). Berlyne’s interpreta- 
tion, or better his transcription of Piaget’s theory into behavioristic terms, 
attempted to capture cognitive development not as a process of abstraction but 
of internalization of action. He proposed that objective transformations ex- 
pressed in the child’s own manipulations or in his observations of other’s 
manipulations are internalized as transformational mediating reponses. Such a 
mediator, presumably, is acquired in the same way as other mediating events— 
through the internalization of motor activities. Internalization, in mediational 
theories, implies an isomorphic relation between the activities which are origi- 
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nally overt but become covert at some later point in time. This mediational 
account, however, encounters several serious problems in explaining how the 
internalization process is initiated. Only if some executive function were im- 
puted could internalization in the typical mediational sense be considered 
seriously. Then, of course, there is the attendent problem of explaining the 
development of the executive function and the development of its coordination 
with these action systems. 

It seems unlikely that Piaget intends that action-based knowing should be 
reduced to specific motor movements. The actual movements only serve an 
instrumental purpose by establishing correspondences, ordering, etc.,—in other 
words altering the figurative aspects of knowing (Furth, 1969; Piaget, 1969). 
Both reflective abstraction and physical abstraction imply the construction of 
structures with properties that are not contained in the motor actions them- 
selves. Moreover, abstraction implies the structuring of mental activities which 
go beyond those particular experiences from which the abstractions are derived. 
In a recent study (Pufall, 1973b), children could not predict the outcome of 
rotating a sequence through 180 and 360° were given experience with rotation 
through 135° and back again so that they could track the displacement of the 
objects. On subsequent testing, they correctly predicted the orderings after the 
180° and 360° rotation. Clearly the children had abstracted a conceptual system 
which assimilated rotational transformations beyond those immediately experi- 
enced in training. 

Before leaving this section and the question of mechanisms of development, it 
seems appropriate to reconsider abstraction in the light of the distinction 
between figurative and operative knowing. Operative structures are abstracted 
but figurative knowing participate in the process of abstraction insofar as it is 
the emperical link between abstract knowledge and the object to be known. The 
notion of elasticity is constrained by our figurative experience of a bouncing 
ball. Although numeric equivalence is abstracted actions of correspondence it is 
clear that establishing the figurative correspondence is a necessary experience if 
the operational scheme is to develop (Piaget, 1952; Pufall, Shaw, & Syrdal- 
Lasky, 1973). Moreover, it is from activities of setting up correspondences that 
the child discovers that figurative properties vary systematically with number. 
Thus, the child not only constructs an operational understanding (if nothing is 
added or subtracted number is conserved) but also this knowledge reflects back 
on figurative knowing in that the child recognizes the logical necessity of the 
numeric relation between two rows given their length-density relations. 


Conservation in Thought 


The compensatory property of thought which sets off preoperatory activity 


from operatory thought is its reversibility. Actions are structured as operations 


when one mental activity implies a reverse activity. For example, inversion or 
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negation is an aspect of reversibility and is manifest in rationalizations of 
conservation predicated on the fact that a transformation can be reversed so as 
to reproduce the original perceptual relations; and reciprocity is manifest in the 
child’s arguments from compensation of perceptual parameters, for example, 
length and density in the case of numeric equivalence. 

However, Piaget, (1967) refers to compensation in another way when discuss- 
ing equilibration. The more equilibrated thought becomes, the more able it is 
to compensate, in the sense of assimilation, information provided by an event as 
well as transformations relating events. This aspect of equilibration then is the 
shift of the knowing activity from its dependence on direct experience with the 
figurative object of knowing and its physical transformations to a knowing 
activity which assimilates an event to an equilibrated structure. 

In a recent study (Pufall, Shaw, & Butcher, in preparation), we provide a 
demonstration of this developmental trade-off between figuratively specified 
transformations and operative thought. We tested 4-, 5-, and 6-year-old children 
on a set of number tasks. On some of the tasks the transformation conserved the 
original equivalence (conservation task) while another set of transformations did 
not (nonconservation). Half of the children could observe the transformations 
and half could not. On the conservation task, the youngest children did not 
judge the sets to be equivalent even if a conserving transformation was observed; 
and, while there was improvement with increasing age, only one-third of the 
oldest children judged that the two sets were equal if they could not observe the 
transformation. On nonconservation tasks, the youngest children did not judge 
the numeric relation correctly, but the five year old children did judge that the 
sets were unequal after observing the transformation, even though in some cases 
length was equivalent. Hence, all of the older children (5- and 6-year-olds) can 
assimilate a transformation relating states and do not reason exclusively from the 
figurative properties which result from the transformation. 

This evidence makes it clear that even though late preoperatory thought 
reflects an understanding of transformations which change the relation from 
equivalence to nonequivalence, it is just as clear that thought is not completely 
structured so that as yet the child does not reflect on that knowledge. Consider, 
for example, that the child acknowledges the fact that in the case of noncon- 
servation problems addition and subtraction changes the numeric relation, and 
yet does not go on to reason in the case of conservation problems that because 
nothing has been added or subtracted the sets remain equivalent. The child’s 
understanding of transformation is restricted to the direct observation of it and 
does not involve, at that level, the reversible organization of transformational 
activities and the implication of this structuring for quantitative relations. 
Finally, only after about age 6 years does the child reason through the various 
relations among length, density, and number when presented with a figurative 
instance but no transformational history. At that level, operative structures are 
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equilibrated and follow laws of logical necessity, that is, the structures of 
thought are conserved in that they are completely reversible. 


Developmental Dependencies in Thought 


One of the implications of Piaget’s genetic theory of though is that we should be 
able to identify the forerunners of thought in the child’s actions at an earlier 
level; and, moreover, these actions or schemata should hold some structural 
relation to subsequent thought. Two methods have been used to analyze this 
implication—or, perhaps, to attack the theory. The first tries to induce change 
through principles of learning, for example, reinforcement or discrimination 
learning sets, while restricting cognitive experiences which specify transforma- 
tions or induce conceptual conflict. From this method it is reasoned that if 
change is induced then the theory of action based thought is incorrect and, 
moreover, that conceptual development follows extablished laws of learning. 
The second approach examines the theory by testing the predicted sequential 
dependencies among cognitive structures. 

Those adopting the first approach have offered various interpretations of 
“what” is discriminated when the child learns to conserve. For example, Braine 
and Shanks (1965) proposed that the child learns to discriminate “apparent” 
from “real” differences. This argument not only begs the question of what 
constitutes an understanding of the distinction between “apparent” from “real” 
but it is also inconsistent with some developmental facts. For example, some 
conservations are understood before the child understands the “real-apparent” 
distinction (Langer & Strauss, 1972). Another variation on the theme of dis- 
crimination learning is that conservation occurs when the child learns to attend 
to the “relevant” and to disregard the “irrelevant” dimension (among others, 
Gelman, 1969; Kingsley & Hall, 1967; Wallach, Wall, & Anderson, 1967). This 
orientation in turn holds that the concept, for example, numeric equivalence, is 
a figural property as are relations such as length and density. Therefore learning 
involves only learning to discriminate and attend to the appropriate figural 
property (number) in the appropriate context, that is, in response to verbal 
questions asking whether quantities are equal or not. Obviously this perspective 
is diametrically opposed to an action-based theory of cognition. j 

The success of the training procedures used in all of the above mentioned 
studies would seem to provide support for their theoretical position. While it is 
difficult to argue with success (or data) the approach seems in many cases 
tangential to Piaget’s theory and, perhaps, without any clear direction as to what 
is being studied. ae: 

Consider the concept of number equivalence and the implication of equating it 
with length and density as properties directly available in the figurative display 
of two linear sets. One implication is that it is a preformed concept which the 
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child does not demonstrate in the typical conservation experiment because of 
performance factors. For example, there might be semantic confusion due to the 
ambiguous comparative phrase “same as,” or there might be a memory problem 
as the child tries to integrate information over time. Through systematic dis- 
criminative training the child learns under what conditions he is to attend to 
number and under what conditions he is not being asked to attend to number. 
However, this approach does not explain why the cue for number is so difficult 
to access. Another implication is that through the education of attention 
number is detected as a distinctive property of any array composed of discrete 
elements (Odom, 1972). But again this approach does not specify the structure 
of the attended property, that is, what is number? It is this epistomological 
question that Piaget attempts to answer. Number is not to be reduced to a 
figural property, but is to be recognized as an abstraction. As a consequence, the 
perceptual properties of a set are not irrelevant but are compensated in thought. 
Because the perceptual properties are compensated, it becomes a logical neces- 
sity that if rows are equal in length but not in density then they are not equal 
numerically. If no element is added or taken away from each of two equal sets, 
then the sets remain equal; and a phenomenal index of this logical necessity is 
reflected in the look of incredulity expressed on the face of the operational child 
when asked if the sets remain equal. 

The second approach in which sequential dependencies are examined has 
yielded confusing results. One set of studies found ordered relations among 
cognitive activities (Bower & Wishart, 1972; Brainerd & Brainerd, 1972; Inhelder 
& Sinclair, 1969; Moynahan & Glick, 1972; Murray & Youniss, 1968; Schloss- 
berg & Pufall, 1974; Youniss & Murray, 1970); a second set of studies found the 
reverse relation (Gelman & Weinberg, 1972); while still a third set found no 
relation at all (Dodwell, 1962; Schlossberg & Pufall, 1974). Although all of these 
studies seem to tap different cognitive activities, there is no apparent trend in 
any of the three sets of studies to explain why some predictions hold and others 
do not. In spite of this confusion this attack seems the most valid and fruitful 
approach to our investigations of the developmental hierarchies of thought and 
their developmental relations. 

It might be especially fruitful to combine this approach with one that includes 
training experiences. Inhelder and Sinclair (1969) point out that some concep- 
tual areas are precursors of areas yet to emerge. The psychological dependency 
can be tested by first identifying children expressing or not expressing the 
precursors and then giving them training. While negative results would not 
necessarily refute the dependency being examined, positive results would take us 
beyond the descriptive level. Some very exciting experimentation on the stage 
sequence in object concept (Bower & Wishart, 1972) and compensation reason- 
ing in conservation (Curcio, Kattef, Levine, & Robbins, 1972) has been done. 
Moreover, such a procedure combining training with sequential dependencies 
analysis might help clarify ambiguities in the results. Schlossberg and Pufall 
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(1974) identified a number of children who succeeded on one type of length 
task but did not conserve number while others conserved number but did not 
succeed on any length tasks. This observation suggests no sequential relation 
among these conceptual areas. However, after training on length and number 
tasks the former children no longer succeeded on the length task while the latter 
group learned to conserve number but did not improve on the length tasks. 
Apparently rather than no relation, we discovered an even more complicated 
pattern of development of cognitive structures. 


THE CONSTRUCTION OF THOUGHT 


In the first part of this chapter I have attempted to outline the self-regulated 
aspects of intellectual growth and at the same time indicate the functional 
relation between figurative thought which is constrained by the structure of 
information in the world and operative thought which is governed by laws of 
logical necessity. In the following section the thesis that thought constructs 
reality will be examined. 


Construction of Object Permanency 


Piaget’s (1954) masterful work on the child’s understanding of an objects 
permanency has been replicated in whole or in part by numerous researchers 
(Gratch & Landers, 1971; Landers, 1971). Even Bower (1967, 1971), whose 
early work seemed to challenge the facts of object permanency, has in recent 
publications validated the stages of development (Bower & Paterson, 1972; 
Bower & Wishart, 1972). Yet Piaget’s interpretation of these findings continues 
to be criticized, especially by Gibson (1966). Piaget contends that an object’s 
continued existence, is specified by the structure of intellectual activities; hence, 
that it is conserved in thought. However, Gibson (Gibson, Kaplan, Reynolds, & 
Wheeler, 1969) points out that the occluding transformation which Piaget 
employs in the object permanence task does not specify its destruction but only 
its occlusion and, therefore, the continued existence of an object. In contrast to 
Piaget, Gibson contends that the invariance of the object is specified in the 
information available in the transformations in the optic array; hence, its 
conservation does not depend on intellectual activity but rather on the nature of 
information in the world. 

A final resolution of differences between Gibson and Piaget seems a goal too 
ambitious for this chapter. What can be accomplished is a clarification of each 
theorist’s position with respect to the phenomenon of object permanence. Both 
agree that information processing is active. For Piaget this processing involves 
the activity of assimilating information to intellectual structures, while for 
Gibson it involves a perceptual activity by which environmental information is 
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differentiated. However, both agree that information is not derived from sensa- 
tions nor characterized as a system of internal registries which contain images or 
copies of sensory events. 

They differ in terms of what constitutes perception. For Piaget (1969) percep- 
tion is descrete, while for Gibson (1966) it takes place over time. A consequence 
of Piaget’s position is that it carries with it the conclusion that perceptions are 
not accurate reflections of the world, that perception is subject to distortions 
which must be compensated for by intellectual activities. This position receives 
its empirical support from experiments which manipulate expectations or sets, 
from perceptual experiments where the quality of information is poor or the 
time “to act” upon it is restricted, and from studies of perceptual illusions 
(Piaget, 1969). 

Piaget also attributes the failure to conserve quantity to the unreliability of 
perceptual or figurative information. However, the unreliability of information 
in this latter case does not relate to whether or not the child perceives the level 
of water as it really is in a liquid quantity conservation problem or to whether 
the child fails to discriminate between the acts of pouring which add or subtract 
something from a quantity. Piaget does not suggest that there is enhancement or 
filling-in of information. He proposes that the knowledge that one can have 
about the world is more than a mere perception of invariance. 

According to Piaget intellectual activities structure these perceptions, while 
according to Gibson the world is structured and, therefore, intellectual medi- 
ating activities are not required. However, for Piaget, to construct this reality is 
to reconstruct it at the level of knowing. For the preoperatory child this 
reconstruction is reflected in his practical organizing of displacements of objects 
in space, and eventually in his symbolic representations of space. For the formal 
operational person this reconstruction might be a psychological theory of 
reality, for example, ecological optics. 

In sum, while an occluding transformation specifies the continued existence of 
the occluded object, the child has to reconstitute that information in the act of 
knowing before he can practically deal with this perceptual invariant (Piaget, 
1969). The reality constructed, moreover, is always constrained by the structure 
of the world reflected in figurative thought. As noted previously, it seems that 
this constraint is imperative to the construction of a structured system of 
thought. It probably should go without saying, but, if this constraint were not 
available, then the structure of knowing would either be completely subjective 
or perhaps nonexistent. 


Construction of Spatial Reference Systems 


The development of object permanence allows the child to deal with spatial 
organization at the level of practical knowing. Yet to develop is the capacity to 
represent spatial organization. This latter capacity has been investigated by 
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examining children’s map-reading skills (Pick, 1972), perspective taking (Flavell, 
Botkin, Fry, Wright, & Jarvis, 1968; Laurendeau & Pinard, 1970; Piaget & 
Inhelder, 1969; Pufall & Shaw, 1973; Shantz & Watson, 1970, 1971), and 
mental manipulation of spatial arrays (Huttenlocher & Presson, 1973) among 
others. In general, these abilities imply the operation of a reference system, that 
is, an abstract system within which aspects of space can be organized and by 
which transformations yielding new perspectives can be coordinated. Piaget and 
his co-workers (Piaget & Inhelder, 1956; Piaget, Inhelder, & Szeminska, 1960) 
have documented the development of reference systems vis-a-vis geometric 
systems, finding that the child’s first representations of space preserve topologi- 
cal properties but do not conserve either projective relations or Euclidean values. 
Not until the onset of operational thought in early grade school do the latter 
two referential systems begin to manifest themselves in the child’s spatial 
thinking. This developmental sequence has been replicated by other researchers 
(Laurendeau & Pinard, 1970, and Pufall & Shaw, 1973). A recent study by Coie, 
Costanzio, and Farnhill (1973) has explored the development of specific aspects 
of projective relations. 

Piaget and Inhelder (1956) also offer another scheme within which to under- 
stand the development of reference systems. They reason that the evolution of 
any reference system is predicated on the functioning of infralogical? structures, 
primarily the multiplication of spatial relations, which assimilate the above 
referential systems. The change in cognitive structures around the first year of 
grade school is marked by a shift from thought which conserves binary relations 
to thought which conserves relations in a multiplicative scheme (Laurendeau & 
Pinard, 1970; Piaget & Inhelder, 1956; Pufall & Shaw, 1973). 

The above analyses of spatial thinking capture the formal characteristics of 
knowing space, that is, the operative structures. They do not analyze the content 
of our spatial knowledge. The content can be specified at two levels. First, there 
is the frame of reference, which determines the nature of the axes coordinated in 
thought (Attneave & Reid, 1968; Rock, 1956; Rock & Heimer, 1957). One 
frame of reference is egocentric and defines the axes as near-far, left-right, and 
headward-footward. Another is defined in terms of axes intrinsic to the physical 
space, such as horizontals and verticals, as well as projections running at right 
angles to these two axes. A third is specified as the gravitational frame of 
reference specified by the force of gravity, detected by the proprioceptive 
systems, and the horizon. While each of these reference frames is phenomenally 
available (Attneave, & Reid, 1968), it is more likely that the adult will opt to 
organize space within a physical reference frame (Rock, 1956; and Rock & 


2Piaget and Inhelder (1956) make a distinction between logic operations and infralogical 
operations. At a formal level they have equivalent properties; however, infralogical opera- 
tions conserve the object of knowing through symbolic representations, for example, 
images, while logical operations conserve collections in terms of classes and numerical sets 
which are known independent of their figurative quality. 
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Heimer, 1957). This preference for a physical frame of reference appears to 
develop and has been explained within organismic theory as the outcome of the 
increasing tendency to differentiate self from world (Glick, 1964; Glick & 
Wagner, 1966; MacFarland, Wagner, & Werner, 1962). 

At a second level of analysis, content refers to perceptual features of any 
space. These features range from surface qualities of color, form and texture to 
more abstract relational features, such as edge and corner, and relational proper- 
ties, such as central versus peripheral, medial versus lateral. The surface features 
probably function as perceptual anchors for reference systems even during the 
preschool years; when the child conserves the proximity between a position and 
a distinctive feature he probably differentiates among the positions about that 
feature in terms of an egocentric system, such as “the ball is to the left of the 
box” (Pufall & Shaw, 1973.) The problem is that at this level the child cannot 
coordinate objects within a complete system nor can he coordinate objects 
independent of self. The more abstract properties undoubtedly have to be 
learned, perhaps through a process of differentiation (Gibson, 1966), and some 
appear to emerge as functional properties after the child has already acquired 
infralogical operations to coordinate spatial content of at least a two-dimen- 
sional space (Pufall, 1973b). 

Spatial thinking intimately involves operative knowing in the form of infralogi- 
cal structures and figurative knowing insofar as what is known always refers back 
to an act of perception, that is, image. Moreover, between these poles of 
preoperational thought is the self as the center of functional activity (Piaget, 
1970a). Numerous studies indicate the continued functioning of egocentricity in 
spatial thinking as adults (Huttenlocher & Presson, 1973; Pufall & Shaw, 1973). 
This is due to the simple fact that our knowing of space at any moment is 
coordinated through a single perspective. This fact in turn suggests that the 
genesis of operational reference systems is in the actions of self in the perceptual 
world. 

Even as young as two years of age, children understand that another person 
can perceive an object even though they cannot see it directly (Masanghay, 
McCluskey, Sims-Knight, & Flavell, mimeo); and Shantz and Watson (1970, 
1971) have shown that three-year-old children expect the projective relations to 
self to have changed if they assume a new perspective. The child’s spatial 
thinking is not dominated by egocentrism if he changes perspective through 
movement or if directed to acknowledge that a person can have a perspective 
different than his own. However, this same child does not possess a reference 
system within which he can transform perspectives and, therefore, represent 
accurately the other person’s perspective. 

These facts alone then do not provide an insight into the development of 
reference systems. A study by Pufall, Megaw, and Aschkenasy (1974) examined 
the ability of four-year-old children to locate an object in space after they had 
altered their perspective on that space by walking around it. Their performance 
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on this type of task was compared to their performance on a duplicating 
location task in which the identical spaces were set side by side but differing in 
orientation. In both conditions the child had to relate what was seen from one 
perspective to that of another; in the case of the walk-around task the child had 
to remember the location as perspective was changed, while no memory factor 
was involved in the second condition. Four-year-old children made significantly 
fewer errors and no egocentric errors when perspective change was contingent on 
their own activity. 

Huttenlocher and Presson (1973) advance the hypothesis that perspective 
change contingent on one’s own activity might be easier because the relation of 
all or part of the space to self is tracked directly or symbolically (imaginally) as 
one moves about the space. Although the data are only in preliminary stages of 
analysis, Jane Megaw’s thesis work using a covered space weakens the argument 
for an hypothetical mechanism which tracks a directly perceived space; and it 
seems unlikely that children as young as four years of age are tracking a 
representation of space. That is, it seems doubeful that the child imagines a 
space and then tracks his change in perspective as he moves about that space. 

Perhaps the ease of conserving spatial relations when one moves through space 
is explained best by Gibson’s (1966) conjecture that perspectives are “con- 
tained” in the act of perceiving; hence, as one moves through space the primary 
fact of perception is that space is rigid which implies that the projective relations 
among objects populating the space are constant. At the same time it is a fact 
that the projective relations to self are variant. It seems reasonable to postulate 
that it is this variance that is due to the activity of the perceiver which would 
motivate “exploratory activities” in search of higher-order invariances (Gibson, 
1966). 

This analysis can be recast in Piagetian terms. As we move through the world 
our activity generates changes with respect to what content is to the left or right, 
or near or far. However, our physical frame of reference remains invariant. At 
the preoperatory level, the conflict between the two might motivate abstracting 
a reference system by which each of these frames of reference can be transcribed 


into the other, Such a reference system would coordinate perspectives so that an 
uld be understood to be relative but, perhaps more 
ate exhaustively those 
f reference from those 


egocentric perspective wo) 
importantly, would eventually allow us to differenti 
spatial relations specified within an egocentric frame O 
specified within a physical frame of reference. 

Our investigations of spatial thinking help to clarify the developmental rela- 
tions among egocentrism, operative thought, and figurative aspects of space. The 
basic technique has been to present the child with two identical spaces set side 
by side. These spaces can either be oriented the same way or one can be rotated, 
usually 180°. If the child is organizing space egocentrically, then it should lead 
to predictable errors in the rotated condition when she tries to place an animal 
on her space, which duplicates the experimenter’s placement on the other space. 
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The first study was not developmental; only kindergarteners were tested 
(Pufall, 1975). As was expected theses children made a high percentage of errors 
when the child’s space was rotated 180°, the vast majority of these were 
egocentric. However, when the child’s space was rotated 90° she made fewer 
errors and almost no egocentric errors. The reduction in egocentrism does not 
seem to be related to the degree of rotation per se as much as to the effects of 
the two rotations. The spaces were simulated farm scenes with a road running 
through the middle near to far and a stream running through the middle left to 
right. When one is rotated 180° these aspects are realigned with each other and 
with the child’s egocentric system. When one is rotated 90°, the stream of one is 
in the same axial orientation as the road in the other. As a consequence an 
egocentric attitude would change the feature next to which the object was 
placed. Additionally, the animal’s orientation would have to be shifted from 
parallel to a right angle to a feature. A more recent study (Pufall, Megaw, & 
Aschkenasy, 1974) yields results consistent with these assumptions. The spaces 
were circular and the objects on them were inside an isosceles triangle. Rotation 
did not produce any figurative congruence between spaces nor did it produce 
similarities with respect to an egocentric frame of reference. Although errors 
were high, the percentage that were egocentric never reached the same magni- 
tude as those found with rectangular space. 

Apparently, if the two spaces are figuratively similar, then the children 
organize space egocentrically but not when they are dissimilar. Perhaps in both 
cases they approach the spaces egocentrically and alter that attitude only if the 
placement yields perceptual discrepancies. When these discrepancies are detected 
the child shifts to a “world” orientation; and while not always correct in their 
placements, they infrequently function egocentrically. 

In another set of studies we presented children, 6 and 10 years of age, with 
space divided into quadrants with pegs in each corner of each quadrant. Some 
children had colored pegs and performed very well, essentially making no errors 
of location (Pufall & Shaw, 1973). Other children had white pegs in the corners 
of each quadrant (Pufall, 1973b). In the latter condition, the younger children, 
in contrast to the older children, made many errors, and the majority of them 
were egocentric. They appeared to organize the contents of quadrants ego- 
centrically. For instance, if the experimenter’s animal was located on the near 
side of the far right hand peg, the child would put hers on the near side of the 
far-right peg, irrespective of quadrant. Even in making their errors the 6-year-old 
children indicated that they were structuring space, logically multiplying the 
dimensions of near-far and left-right. Their intellectual functioning reflects 
concrete operational reasoning. As long as there is perceptually distinct informa- 
tion to be organized, they can do and organize it. But in its absence they resort 
to an egocentric orientation. 

Our final experimental condition has yielded interesting information about the 
child’s capacity to imagine the rotation of planar spaces. We showed the children 
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two completely symmetrical spaces which when rotated looked the same as 
before rotation. Then we placed an animal on our space and asked the child to 
imagine what it would look like when rotated 180° and to indicate his answer on 
his display. Those children who had experienced the spaces with colored pegs in 
each quadrant (Pufall & Shaw, 1973) made numerous errors, the older children 
being more likely to make egocentric errors than the younger children. From 
their behavior, turning head and trunk and pointing their finger so as to describe 
a circle, it appeared that they were processing the transformation in a way 
similar to the adults in Shepard and Metzler’s (1971) study. They seemed to be 
imagining a unit; for the 10-year-old children the unit appeared to be the 
content of the quadrant moving through space which they did not relate to the 
whole space. This led to some odd placements as the child would change a 
location from the center to an outside corner. Those 10-year-old children who 
had been previously tested with white pegs did not show many errors. Although 
their activities were not systematically recorded, they seemed to be less likely to 
try to twist and turn to achieve a new perspective. Rather, they seemed to be 
using another type of transformation, that is, inverting the relations such as 
near-far while at the same time maintaining the correct relation of the object to 
abstract features of the space, such as the midline vertical. The older children 
might not differ from the younger in terms of operative structures, such as 
logical multiplication, as much as they differ from them in terms of the abstract 
relations organized within these structuring activities. 


A Model of Spatial Thought 


A model might serve the purpose of summarizing the data and speculation 
about children’s spatial thought. The model consists of three assumptions about 
the relation of self to world: 


1. The primary reference frame is self and has to be so by virtue of the act of 
knowing and preceiving. 

2. The world is assumed to be invariant, that is, rigid. 

3. We shift from an egocentric frame of reference to a physical frame if the 
egocentric frame of reference generated relations substantially inconsistent with 
the physical frame. The model also consists of three assumptions about develop- 
ment. 

4. The development of cognitive reference structures is due to activities, on 
relations intrinsic to the world, perceptual activities a la Piaget and Inhelder 
(1956) or exploratory activities ala Gibson (1966). 

5. These are enhanced by self-world discrepancies. 

6. Even after the child has developed abstract referencing systems, develop- 
ment occurs in terms of detecting abstract geometric invariants. 
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INTRODUCTION: WHY PSYCHOLOGY MUST USE 
MECHANICAL MODELS OF THE MIND 


It is impossible to be absolutely certain that a theory of any sort is ever true. For 
example, one theory that many of us believe strongly is that no one is capable of 
moving objects just by thinking about them (psychokinesis). However one is sure 
of this theory, there is still the barest logical possibility that everyone but myself 
is capable of psychokinesis but for reasons unknown have formed a global 
conspiracy to hide this fact from me. Indeed, the only thing one can “know” for 
sure are the theories or fantasies one chooses to believe on the basis of evidence 
deemed both appropriate and secure. Thus, scientific skepticism is merely a 
healthy confession that we are all vulnerable to error and, therefore, should 
remain open to all reasonable arguments. 

Consequently, in this chapter I wish to share with you my beliefs and theories 
regarding the nature of cognitive growth and change. Two major points will be 
made: First, all such theories should be publicly expressible in the clearest and 
most concise form possible. Second, the best possible form such theories can 
take to meet the first requirement is to be mechanistic. 

But must psychology ultimately rely upon mechanistic theories of mind, that 
is, theories expressed in terms of explicit functions for which there exist 
procedures by which the value of such functions may be precisely computed, 
One obvious reason for an affirmative answer is that by their very nature such 
mechanistic theories can be shared publicly since they are formally precise rather 
than just intuitive. Let us explore this argument for a moment. 

Two kinds of theories are possible: mechanistic (formal) ones which can 
always be made public and intuitive ones which usually cannot be and hence 
necessarily remain private beliefs. Mechanistic theories, unlike intuitive ones, can 
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be reliably communicated to other people because there exists a convention by 
which definite conclusions can be drawn from the theory by anyone appro- 
priately tutored. Such theories can be reduced to finite descriptions of mechani- 
cal procedures for calculating the answers to the various questions that might be 
posed. Furthermore, these theories are so simple and mechanical that one can 
safely communicate them to even the most naive and obstinate colleagues. On 
the other hand, private theories cannot so communicate although one may value 
them highly. From such intuitive theories, however, one may fashion public 
theories that can be safely broadcast without fear of futile argument over mis- 
understandings and without fear of embarrassment from friends who profess 
to accept them but cannot agree about their implications. 

Thus, a mechanical view of the mind (and the universe) is imposed on us by 
the social desire to confirm our ideas with others. Even if something is perfectly 
understood, in a nonmechanical way, say by intuition or revelation, and all 
answers to all questions could be given correctly, one would still be unhappy 
with the theory until it could be understood by others. It is at this point that 
mechanism is essential. 

With this understanding of the reasons for using mechanisms (e.g., algorithms, 
computations) for scientific theories or models, I would like to proceed to a 
discussion of three classes of mechanism that people have been able to use. 


THREE CLASSES OF MECHANISM OR PUBLIC THEORY: 
TABLE, STATE, AND GROWING MACHINES 


Class I: Table Machines 


The simplest kind of theory that can be transmitted to someone is an exhaustive 
list of all possible conditions (or perhaps all questions that might be posed for 
the theory) together with a description of what happens as a result of each 
condition. Abstractly the set of conditions about which the theory has some- 
thing to say might be represented as an input alphabet. For each symbol in the 
input alphabet there is one output symbol. The correspondence of an output 
symbol for each input symbol is simply memorized or listed in a table. A table 
theory or machine might be schematized as shown in Fig. 1. This machine would 
operate by receiving an input symbol and emitting an output symbol as directed 
by its table of input-output pairs. Such a theory, to my mind, is about as 
predictable, rigid, and mechanical as you can get, and still be capable of 
producing psychologically nontrivial behavior. 

To be communicated to other people, the description of a table machine must 
be finite. A complete description would define an alphabet of the symbols used 
and an operation or a list of all input-output pairs in the machine’s table. Of 
course the alphabet of symbols might be very large and the operation performed 
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OPERATION TABLE 
INPUT OUTPUT 


INPUT SYMBOL OUTPUT SYMBOL 


> 


FIG.1. A table machine, the simplest kind of public theory or raw data. 


may be very complex (a different output for each input). A table theory could 
be very big and sophisticated if only humans were patient enough to use and 
communicate it. For example, we might let each word in the English language be 
a different symbol in the alphabet. Or we might let each pair of words be a single 
symbol. With an alphabet having a separate symbol for each string of 500 or less 
English words, we could construct a table theory that described with an output 
of “yes” or “no” all grammatical sentences of interest to almost anybody. There 
is absolutely nothing wrong with such a theory of grammar except that humans 
cannot deal with it. The number of symbols and the table it uses are too big. For 
the same reason, table theories are inadequate for describing organisms that grow 
or learn or change their behavior with time. Since a table does not change, it will 
always give the same output each time it receives the same input. If we want a 
theory for something which has input-output behavior that may change in time, 
then the time must be included as part of the input symbol so that different 
parts of the table will be used for different times. Even worse, an organism’s 
behavior depends not only on its input and the time, but on the whole past 
history of the organism, so the organism’s history must be part of the input 
symbol for a table theory. Fortunately there are other classes of mechanism that 
deal more satisfactorily (economically) with time changes of input-output 
behavior. 


Class Il: State Machines 


A table machine can be made into a state machine by adding a variable which is 
used to save at least part of the most recent output symbol and then using this 
stored output as part of the next input symbol as suggested in Fig. 2. State 
machines are usually defined so as to have a set of internal state symbols. In such 
a case the input to the operation table is assumed to be a pair of symbols; one 
from the environment (the real input symbol) and one from the machine itself 
(the feedback state symbol). Similarly on the output side of the operation table 
there are pairs of symbols; one is output to the environment and the other is 
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OPERATION TABLE 


OUTPUT 
VARIABLE 
STORAGE 
AREA 


INPUT SYMBOL I OUTPUT SYMBOL O 


FIG. 2. A state machine, a more powerful and common kind of public theory. 


sent back (or retained) in the machine. Just as we were willing to let two words 
be represented by a single symbol, we now might let an input and state symbol 
pair be represented by a single symbol or at least a single entry in the one 
operation table. 

On the other hand it is often more convenient to break the state machine into 
parts. For example a separate table might be given for the real output symbols 
and another table for the internal state symbols as in Fig. 3. This decomposition 
into two pieces results in more tables, but the tables might be simpler. For 
example if there were S different internal states and O different outputs, then 
there might be as many as S times O different pairs in the right-hand side of the 
table in Fig. 2. But the total number of different symbols appearing in the 
right-hand side of both tables in Fig. 3 would be at most S plus O, which is less. 
There would be repeats of the same symbol on the right in Fig. 3. Because of 
these repeats it may be possible to eliminate some of them and shorten the table 
to gain some descriptive economy. For example, if for a particular state symbol 
appearing on the left-hand side of a table, no matter what input symbol it was 
paired with, we read off the same thing on the right side of the table, then we 
would replace all such entries with a single entry in the table. This entry might 
have a blank where the input symbol on the left would have been, to indicate 
that no matter what input, this entry is used. Even further, it may be possible to 
decompose the machine more by using several state variables instead of one. 
Eventually we might end up with a large number of variables, each with its own 
relatively small operation tables as suggested in Fig. 4. 

Machine decompositon (which might also be applied to table machines) is a 
complex problem. For a simple example, I remember that it has something to do 
with the reason a hundred numbers can be easily represented with only ten 
symbols and two decimal place variables. Even more, it has to do with decimal 
arithmetic being easy because the one’s place is relatively independent of the 
ten’s place. Machine decomposition consists of discovering or inventing a repre- 
sentation of a machine where the variables are relatively independent of each other. 


OPERATION TABLE 


OUTPUT 
VARIABLE 


STATE 
VARIABLE 


sed state machine with the entries in the operation table for the 


FIG. 3. A partly decompo: 
ate does not depend on the input. 


state variable showing some cases where the st 


FIG. 4. A thoroughly decomposed state machine. 
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It is this relative independence that allows descriptive economy. Also, relative 
to one variable, another may seem invariant; such a relative invariant in a large 
table may also be described as a symmetry. The discovery of these relative 
invariances is a form of induction. 

A few points should be stressed. First, every decomposed machine, as in Fig. 4, 
still behaves exactly the same as some single table machines, as in Fig. 2. It is not 
true that a larger number of variables and a more complex structure of intercon- 
nections make possible the expression of a theory that couldn’t have been 
expressed by a single variable. All the structure and interdependence that is 
made explicit in a decomposition was implicit in some one large operation table. 
Still, we might prefer the more complex looking decomposition because it makes 
the structure more apparent, and further because it may afford an economy of 
description and communcation. Not only might it be a shorter description, but it 
may also be that the colleague to whom the description is being given, already 
knows some of the tables from other theories. 

Relative to a table machine, the chief advantage of a state machine is its 
capacity for storing some internal information or resetting internal conditions so 
that it can respond differently to the same input depending on its internal state. 
For example an organism may respond to a stimulus and then change the value 
of some internal association strength so that it later responds differently to the 
same stimulus. Further, the internal settings may depend on a long sequence of 
past events, and not just on the most recent input. 

The chief disadvantage of a state machine is that it must be initially set up to 
handle not only what an organism learns, but also everything it might have 
learned but did not. For example, a state machine to describe the human 
acquisition of a language must have all the states and all the operating tables that 
describe how the states change for every language a person might learn and for 
every possible order in which each language might be acquired. Describing the 
acquisition of language may seem much too much to expect from any theory, 
public or private. But even for very simple things, state machines are inadequate. 
For example, it is not possible to set up a state machine that can be used as a 
theory of such a simple language as arithmetic. This is because arithmetic has 
indefinitely large numbers, so no finite communicable state (or table) machine 
can even list all the numbers. In our every day lives we get around this limitation 
by memorizing some finite tables for multiplication and addition of the ten 
digits, and then using these tables on indefinitely large numbers which we 
construct by tacking on more decimal places. We let our numbers grow by 
adding on a new variable in a certain place, as needed. 


Class Ill: Growing Machines 


A state machine can be made into a growing machine simply by having some rule 
for adding a new variable to the network of already existing variables. (In an 
equivalent undecomposed machine this would be the same as creating new 
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symbols and enlarging the operation table.) Such a new variable is suggested by 
the circled unit in Fig. 4. Rather than simply changing the value of some 
variables, a growth process actually changes a state machine into a different state 
machine. 

There are three important things to be said in favor of growing machines. First, 
they can do things that no table or state machine can do, simply by virtue of 
their ability to grow larger than any one state machine. For example a growing 
machine can at least count as high as anyone wants, just by growing a string of 
decimal-place variables. Second, growing machines provide for two different 
kinds of change. One is the change in the value of the variables as described by 
operations and the other is the change in structure described by the growth 
process. Thus there is a real qualitative difference in functioning and growth, 
Too often our theories of growth (and learning) have merely been theories of 
adaptation where a few variables only change value more gradually than others. 
Growing machines provide a way to separate growth and adpatation con- 
ceptually. Third, growing machines are superior because they offer new forms of 
descriptive and communicative economy. Describing growth as the addition of a 
variable to a decomposed machine (instead of the addition of symbols and the 
expansion of the operation table) allows the machine to grow very fast. Adding 
one variable that takes on ten values is equivalent to multiplying the size of the 
operation table and the alphabet of symbols by a factor of ten. Also, instead of 
having to describe first off a very large state machine, it may be possible to 
describe only a very small one together with a simple rule or growth process for 
gradually enlarging it. 

How simply can the growth process be described? We need only specify three 
things which are; when the new variable is added, what the new variable’s 
operation table is, and how the new variable is linked to the old ones. This might 
easily be done by saying that the new variable is to be added whenever a certain 
old variable takes on a certain value, that the new variable always has a certain 
specified operation table, and that the new variable is linked to old variables that 
at the time have taken on certain values. (Some care must be taken that the old 
variables can be made to depend on the new one, there must be so to speak 
unused sockets on their left sides for the new variable to plug into.) 

A well-known result by A. N. Turing and others indicates that it is possible to 
specify very simple growing machines that are powerful enough for one of them 
to be able to imitate any other machine of any sort whatsoever that anyone has 
been able to think of (Arbib, 1969; Minsky, 1967). Put another way, the 
fundamental nonresult of twentieth century logic has been that every attempt to 
come up with a new class of machines (or a new method of computation) has 
ended up being a particular variation of growing machines and equivalent in 
capacities to one of Turing’s Universal Tape Machines. For much further detail, I 
refer the reader to a text on automata theory (the ones by Minsky, 1967, or 
Arbib, 1969, are good). I only want to make two points here. First, it seems very 
likely that growing machines of one sort or another are the most powerful and 
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comprehensive kind of public theory we can have—although no one has been 
able to prove this, it just turns out again and again. And second, the kinds of 
growing machines that have been developed so far have been designed primarily 
to make it easy to prove certain things about what they can do. As a result their 
growth processes and the interconnections of their variables are kept very 
simple, so while it is possible to prove that they can do just about anything, it is 
clear that it takes a lot of specifying to get them to do any particular thing. The 
problem I would pose is; what kind of growing machine is complex enough and 
powerful enough to serve with descriptive economy as a convenient public 
theory for psychology? 


Recapitulation 


By way of recapitulation I will set out a little mnemonic I use to remember the 
important concepts involved in a growing machine. A growing machine consists 
of a set of constructs: 


(V, A, L, O, G) 


where V is a finite set of variables which at any one time are assigned certain values 
or symbols; A is a finite alphabet or set of symbols which are assigned to (or stored 
in) the variables; L is a finite set of links or interconnections that define how the 
variables depend on each other (this is the structure of the machine); O is a finite 
set of operations or tables of finite size which define how the variables change 
from one symbol to another (this is the functioning of the machine); G is a finite 
set of growth processes which describe exactly when new variables are added, 
how they are linked to old ones, and what their operation tables must be. 

Of course, to be a public theory a growing machine must have a fairly small and 
complete description, and further a description which is so simple and explicit 
that even my most obstinate and simple minded colleague (a computer?) can 
agree to understand it. 

What I have described here is obviously not itself a public theory. I have not 
been able to come up with a general method for precisely defining growing 
machines that satisfies me. The trouble has been that different ways of de- 
scribing a growing machine make it easy to do or understand different things. 
What I am looking for is a general way of describing them that will make it easy 
to construct machines that will stand as public theories or models of human 
intelligence. That is no simple task so I am unembarrassed to admit failure. 
Should I come up with something interesting, rest assured PI] let people know. 

In the next section, I will describe some of the causes of the difficulties 
typically encountered. These difficulties arise because there are certain things 
which it seems obvious one should try to do in a public theory. Some of these 
things may not be as relevant to the description of human intelligence as they 
appear, and in fact by making them easy to do, we may be making our real task 
more difficult. 
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SOME THINGS WE SHOULD STOP WORRYING ABOUT 


Mechanical Mind 


First, as argued earlier, we should stop worrying whether or not the mind is 
mechanical and simply decide that public theories should be mechanical. Mecha- 
nism has to do with the communicability of theories, not with what the theories 
are about. If we accept this concept then we can turn our attention to a search 
for powerful, economical ways of describing mechanical theories in general. 


Closed Systems 


Once we decide to use public theories that can be given finite, mechanical 
description, we often outdo ourselves by also deciding that theories should only 
be about finite things; we set up permanently closed, self-contained systems 
(mechanisms). It is appropriately modest to agree that we will not try to have 
one mechanical model for the whole universe, but by strictly fixing the outer 
boundaries, it becomes difficult to talk about the interaction between the thing 
a mechanical theory is about and the rest of the environment or universe. In 
modern quantum mechanics for example it is difficult to write out the equations 
for a set of particles without assuming first that there is absoutely nothing else in 
the universe that can ever interact with them. If is difficult to add complications 
as they are encountered. More generally, mathematicians develop some solutions 
to systems of equations in which variables change values, but they have great 
difficulty when the equations themselves change or when new variables and 
equations are allowed to be added—even in a lawful manner. 

Growing machines force our attention to the environment, encouraging us to 
consider seriously where the new variables come from. A growing machine that 
is being simulated on a computer grows by appropriating from the host com- 
puter new unused memory locations. Turing’s machines grow by having an 
attendant in the environment who patches on new lengths of computing tape 
whenever the machine needs more. And one carries out large computations by 
appropriating scraps of paper from the environment that nobody else is using, 
and writing on them. At another level molecules in the blood stream help to 
restructure brain cells. j 

So, instead of talking only about closed fixed systems, we could begin by 
talking about very large, incompletely described systems which will represent an 
environment. The environment could be partially described as a set of variables, 
alphabets, links, operations, and binding processes. In the environment new 
variables are not grown, but rather old ones are reconnected by changing their 
links; a variable depends at one time on one variable and then again on another. 
Now a completely specified subsection of the environment can be described asa 
growing machine whose growth process is simply an environmental binding 
process viewed locally. Some of the growing machine’s variables will be linked to 
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variables outside in the environment and these variables will serve as the 
input-output interface with the environment as suggested in Fig. 5. 

Now within this larger open system we can address ourselves to some interest- 
ing questions. One is, how are environmental variables incorporated into the 
growing machine; how do my brain cells restructure themselves using nutrients 
from their environment, and what does this do to the variables of electrical 
activity in my brain (at another level of discourse). Another interesting question 
is, how does the structure of the environmental variables working through the 
input-output interface change the values and structures of the growing machine 
variables and vice versa; how do we use bits of paper in our computations and 
more generally interact with the environment behaviorally? It is in serious 
preparation for answering this last question (or even asking it intelligently) that 
the ecological approach to psychology insists on a more careful investigation of 
the environmental variables themselves. 


Prediction before the Fact 


Prediction seems bound up with a whole host of mathematical concepts such as 
proof, convergence, stability, and the halting problem for Turing machines. 


/ 

I 

| 

| 

| 

| 

| 

? 

Pool of variables 

available for growth: 
\ Nutritive or support 
N environment. 


FIG. 5. A growing machine embedded in a larger environment. 
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However, an excessive concern for these things is inappropriate for a public 
theory of human intelligence. The plain fact of the matter is that human 
beings are not always predictable, or stable, their behavior does not generally 
converge, and there is only one way to halt them. When a child learns something 
new, its behavior does not converge or settle down or stabilize, but rather its 
behavior moves into new modes of exploration and interaction with the environ- 
ment that never occurred before. 

It turns out that growing machines too are generally unpredictable and 
unstable, their behavior does not always converge, and they halt only if carefully 
supervised, Most investigators of growing machines consider all these as highly 
undersirable properties, and they devote much effort to avoiding them. It seems 
to me these properties make growing machines very attractive possibilities for 
modeling and understanding human behavior. But let me focus on the single 
property of predictability and try to explain why it is unreasonable to expect 
any public theory to predict human behavior (or generally the behavior of things 
which grow). 

First let us consider Class I or Class II public theories. Suppose Mr. Philosopher 
Knowitall approached Mr. Human Perverse and says: 


KNOWITALL: I have a completely successful theory for predicting human 
behavior, 

PERVERSE: I don’t believe it. Prove it to me. 

KNOWITALL: O.K. Propose any situation you like. PII write on a piece of 
paper my prediction of what you will do in that situation. Then we’ll put 
you in the situation, see what you do, and compare the results with my 
prediction. Of course you cannot see my prediction until you’ve done your 
thing. 

Mr. Perverse agrees and after dozens of trials in which Mr. Knowitall unerringly 
predicts his behavior, our transcript resumes: 


PERVERSE: I don’t understand. Even when I acted as I normally wouldn’t, 


you still knew what I was going to do. 
KNOWITALL: Yes. My theory takes into account that you may know I am 


watching and playing the prediction game with you. 

PERVERSE: (admiringly) That’s quite a theory. Could you possibly teach it 
to me? 

KNOWITALL: (unwittingly) Why yes, of course. It’s a public theory. Here’s 
how you figure out what you'll do. .. . 


At this point in this kind of argument there is always a gap where the alleged 
theory is supposedly described. After his instruction Mr. Perverse proposes: 


PERVERSE: (slyly) Let’s test your theory againonme... 


There is little more than literary exercise in recording Mr. Knowitall’s disap- 
pointment when his theory turns out to be 100% wrong now. Since Knowitall’s 
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public theory was at best a state machine, Mr. Perverse was able to compute in 
advance its prediction and so do exactly the opposite. In fact Knowitall’s theory 
couldn’t have worked all the time in the first place if one of the test situations 
led to Perverse learning or guessing the theory. This argument, however, is 
unsatisfying because it all depends on Mr. Perverse finding out about the theory. 
Except at that one particular point, it still might be possible that his behavior 
was predictable. In practice we know that no one can predictably manipulate a 
human’s behavior for long before the human finds out or suspects and reacts 
perversely. To be predictable is sooner or later to be too simple. 

Let us suppose now that Mr. Perverse approaches Mr. Philosopher Knowbetter 
who has a Class III (growing machine) theory of human behavior. 


PERVERSE: Tell me, do you have any good theories about human behavior? 
KNOWBETTER: As a matter of fact I do. 

PERVERSE: (with exaggerated interest) Please do tell me about them, 
KNOWBETTER: Certainly. . . . 


Here too the curtain of the unknown occludes our view. The transcript resumes 
with: 


PERVERSE: Well now, I suggest a little test of your theory’s predictive 
powers OK? I'll propose a situation and you write down your theory’s 
prediction of what I'll do in that situation. Then PII do whatever I do in that 
situation and we’ll see if you were right. 

KNOWBETTER: (smiling) OK. But you needn’t even tell me what the situa- 
tion is. 

PERVERSE: (shaken) What!? 

KNOWBETTER: (whips out a piece of paper) As a matter of fact I already 
have my prediction written down here. 

PERVERSE: But there is nothing written on that paper! 

KNOWBETTER: I know. My theory has nothing to say about how you will 
act. My theory only sets up a working model of a purely hypothetical 
person. 

PERVERSE: But can’t you at least predict what your hypothetical person 
would do? 

KNOWBETTER: Well, yes, but calculating that prediction is exactly equiva- 
lent to setting up the model and letting it run. So by the time I have the 
prediction, my hypothetical person has already done it. In my theory, 
description, prediction and simulation are all the same thing. 

PERVERSE: What if I told your theory about a hypothetical person to that 
hypothetical person? 

KNOWBETTER: Well, then your educated hypothetical person would be quite 
a different person from my original one. 

PERVERSE: Ah. I see. The predictions my educated person could make using 
your theory would only apply to your original hypothetical person but not 
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necessarily to my educated one. I don’t suppose it would do any good to 
describe to my educated hypothetical person all about educated hypotheti- 
cal persons? Or maybe I could let him use your theory to work that out for 
himself? 

KNOWBETTER: No, I’m afraid not. The description would never catch up to 
the person using it. 

PERVERSE: Don’t you have anything to say about real persons? 

KNOWBETTER: Not much. Except that they behave rather like my hypo- 
thetical ones. I’m better at making up hypotheses than I am at finding out all 
the facts anyway. 

PERVERSE: Aren’t you interested in the facts at all; don’t you want to make 
any predications about the real world? Like, how about setting up a 
hypothetical me? 

KNOWBETTER: No. I’m only human. Pd rather decide how things could be 
than find out. Besides, in the process of setting up a good model of you I 
would have to find out a lot about you. Just in the process of asking you 
questions about yourself, Pd be changing you into a different person. Pd 
never keep up with you. And it’s too much work anyway. ld rather set up 
seomthing simple and let it do the work of growing into something complex. 
Even if I cannot predict your behavior, if I could only make something that 
behaved like you, I would be content to rest. I learned quite a bit trying to 
play that prediction game. It’s made a new person out of me. 


THE GROWTH OF HUMAN INTELLIGENCE 


Many people have been building growing machines (that is, theories) for some 
time now, particularly in the fields of logic and computation. In fact there are 
thoroughly mechanical theories about such an advanced human endeavor as the 
process of mathematical proof. Mechanical procedures have been described 
(Robinson, 1965, but see Nilsson, 1971) which will always construct a proof for 
a theorem provided only that the theorem turns out to be true. The only trouble 
is these proofs may take a very long time to construct, which is to say, they may 
not be very elegant. I think this may be due to the fact that we are very timid 
about the growth processes we give to these theories. If we let these things grow 
too fast, we may not be sure they will ever stop growing, converge, stabilize, or 
come to a halt and give an answer. In fact the best mechanisms we have now for 
logic may never halt if given what turns out to be a false theorem. Excessive 
growth is a very real and practical problem for mechanical theorem provers, and 
the logicians, to achieve their goals, must do something to keep their theory 
under control. 

However, the valid concerns of the logicians are not so relevant for a theory of 
human behavior. In particular, for a growing machine simulation of human 
behavior, it is not so important that the thing come to a halt (representing 
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death?), or that it converge (doesn’t a healthy human’s behavior diverge into 
more and more varied forms) or that it stabilize (do humans ever make up their 
minds irreversibly?). So, the growth processes used in growing machines so far 
may not do what a psychologist would like. In the remainder of this section I 
propose to tell everything I think I know about the growth process of human 
intelligence. (It is unbearable resisting the temptation to end this section here.) 


Human Theorem Proving 


Let me begin with the people I have been talking about. What is it that logicians 
are able to do in their proofs that their mechanical theory does not? I think it is 
this: Often a logician will introduce a new symbol, give it precise definition in 
terms of the old symbols, and then using this new symbol, speed on with the 
proof. (More generally, a mathematician will say “Suppose we had a certain 
(function, set, matrix, . . .), call it A, such that... .”) So far, a growing machine 
can do the same. The new symbol and the clumping together of old symbols in a 
certain relation represents intellectual growth for the logician. The new symbol 
is not merely an abbreviation that simply shortens the proof, rather it serves to 
focus the logician’s attention and to restructure his thought. In terms of a 
growing machine, the logician’s linking up of a new variable, or the expansion of 
an operation table by adding a new symbol, may make possible new, more 
economical decompositions. The resulting enlarged system may thus end up 
being more easily communicated, and, in view of the limits on our processing 
capacity (Miller, 1956), more easily used. 

So, human intelligence grows by creating a new variable, or new values an old 
variable may assume, such that the global complexity of the whole system 
increases, but the local complexity of the problem being solved is reduced. The 
real problem is how to know (mechanically) what will reduce local complexity. 
For now my suggestion is simply that growing machines and decompositon may 
give us a formal handle on the problem, and further, a careful psychological 
investigation might give us some content to cast into that form. A few examples 
may help to clarify this conclusion. 


Hierarchies of Theories 


Again and again I see complex theoretical systems constituting themselves in a 
hierarchical structure of levels of discourse as characterized in Fig. 6 (see Simon, 
1969). This structure seems to me likely to have something to do with the way 
human intelligence grows. Take the mathematics-to-psychology hierarchy for 
example. Of course it is too late to watch its growth, and the real history is very 
complex, but evidence of how people developed these theories remains. Speci- 
fically, the particle of physics is theoretically a clumping of interrelated states, 
operators, equations of motion and mathematical properties. The atom or 
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FIG. 6. Hierarchies of theories where a number of constructs at a lower go together to 
make a unit in a higher theory. 


molecule of chemistry is a clumping of closely related particles in physics. The 
living cell of biology is a clumping of molecules in chemistry. Psychobiology is a 
special clump of cells, and psychology is this clump of cells interacting with 
clumps in the environment. And a mathematical concept should be a clump of 
some sort in psychology so the system closes on itself in a circle (which is better 
than dribbling on forever, or mysteriously terminating in unquestioned axioms). 

The transistor-to-operating-systems hierarchy represents the levels of discourse 
for talking about a computer. No one tries to describe a computer as a wiring 
diagram of its transistors, rather the transistors are clumped into modules and 
registers. A discussion of more complex functions treats as units, whole clumps of 
memory registers called routines, programs, or data structures. And a really 
serious discussion talks about operating systems of interacting programs and data 
structures. One nice thing about the computer example is that its development is 
recent and proceeded in fact from bottom (with vacuum tubes or relays for 
transistors) to top. 

In my own work I took the step from cell assembly to short-term memory in 
the third hierarchy of Fig. 6. Arguing with Hebb (1949) that a discussion of 
behavior in terms of individual neurons would be too complex, I postulated 
patterns of neural activity as the basic unit that seemed reasonably definable in 
terms of assemblages of large numbers of neurons throughout the brain. But 
rather than making that definition, I started with cell assembly as a basic simple 
unit, postulated how they would interact and grow, and used this higher-level 
theory to model the development of sensorimotor intelligence described by 
Piaget. Then when I was done, it became clear that my clumps organized 
themselves into larger clumps (or at least I began seeing them that way) that 


206 MICHAEL CUNNINGHAM 


could act as short-term memory for the goal directed behavior that emerges 
towards the end of Piaget’s sensorimotor period (Cunningham, 1972). Quickly 
regrouping, I pushed along towards language, but as Quillian (1972) suggests, it 
is now essential to give a simple mechanical definition of the behavior and 
formation of these larger clumps and use them to describe the yet higher 
functions. 

So, again and again we see humans complicating their problems by inventing 
new terms, ideas, concepts and clumps and ending up with a whole nest of 
complexly interrelated theories instead of just one. But by increasing the global 
complexity we simplify a particular local discussion. 


Piaget's Objectification 


Another example of this complex-simplification process of intellectual growth 
is Piaget’s theory of sensorimotor intelligence. Piaget writes of the coordination 
of basic sensorimotor schemata which could in a growing machine be repre- 
sented as the creation of a new variable with links to the two previously 
independent sets of variables that might represent the schema of the hand and of 
the eye. Piaget asserts that the child attributes objective reality to something 
grasped and seen only after such coordinations have been constructed or learned. 
Here it seems that the coordinating variables are themselves the child’s internal 
representation of an object. In a sense the coordinating variables are the child’s 
hypothesis that there are objects that exist independently of any one of his 
sensorimotor schemata. Such new variables at once complicate matters (since 
anything seen may now be reached for, and anything sensed with the hand may 
now be looked for) but also simplify matters by allowing the two schemata (of 
hand and eye) to function together in a coordinated fashion with objects in the 
environment. There is only one object rather than disparate sights and tactile 
stimulation. 


ARTIFACTS OF INTELLIGENCE 


Presumably, we cannot know with certainty anything except what it is for the 
moment that we choose to believe. So, too, we do not know that anything at all 
is mechanical—except for the mechanical theories we can construct and use in 
our heads. If ecological psychology has a concern for the environmental struc- 
tures that are immediately available to the senses, then the other side of that 
coin must be a concern for the structures that the mind imposes on its 
environment. These complexly structured theories, mechanisms and languages 
are artifacts molded by the mind and I can think of no better data from which 
we might get an impression of the mind than from the things that are solely of 
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its creation. Computers and programming languages provide a particularly good 
example as Papert (1971) suggests. 

Mechanism theorists tell us that it seems every mechanism, algorithm, com- 
putation or public theory can be expressed as some kind of growing machine. 
The puzzle I would propose is this: What is the psychological reality behind such 
things as symbols and variables, linked structures and operations, and growth or 
binding processes? And if there is any, why is it we evolved to think this way? 
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Of the distinction which his own efforts had done much to foster, Magendie 

commented in 1824: 
The organs which concur in muscular contraction are the brain, the nerves, and the 
muscles. We have no means of distinguishing in the brain those parts which are 
employed exclusively in sensibility, and in intelligence, from those that are employed 
alone in muscular contraction. The separation of the nerves into nerves of feeling and 
nerves of motion is of no use: this distinction is quite arbitrary [cited in Evarts, Bizzi, 
Burke, Delong, & Thach, 1971, 111-112]. 


More recently this viewpoint has been expressed in a different but closely 
cognate fashion by Trevarthen (1968): “Visual perception and the plans for 
voluntary action are so intimately bound together that they may be considered 
products of one cerebral function [p. 391].” 

In the light of such remarks, it is curious that theories of perception are rarely, 
if ever, constructed with reference to action. And, while theories of perception 
abound, theories of action are conspicuous by their absence. But it must 
necessarily be the case that, like warp and woof, perception and action are 
interwoven, and we are likely to lose perspective if we attend to one and neglect 
the other; for it is in the manner of their union that the properties of each are 
rationalized. After all, there would be no point in perceiving if one could not 
act, and one could hardly act if one could not perceive. 

Of course, history has not been remiss in comments on the relation between 
perceiving and acting. From the time of Aristotle it has been taught that the 
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motor system is the chattel of the sensory system. Nourished by the senses the 
motor system obediently expresses in automaton and relatively uninteresting 
fashion the cleverly contrived ideas of the higher mental processes, themselves 
offshoots of the sensory mechanisms. In this view, action is interpretive of the 
sensory mind and thus, in principle, problems of coordinated activity are 
secondary to and (if we assume an associative link between sensory and motor) 
independent of problems of perception. It has also been taught, usually with less 
fervor, that perception is a disposition to act: to perceive an event is to be 
disposed to respond in a certain way. Modification of this view leads to a 
constructive theory of mind in which it is argued that higher mental processes in 
addition to perception are skilled acts that reflect the operating principles of the 
motor system. In short, experience is constructed in a fashion intimately related 
to the construction of coordinated patterns of movement. So far as action 
assumes primary importance in this approach to mind, we would expect its 
proponents to put great store by the analysis of coordinated motions. However, 
where motor-theoretic interpretations have been forwarded to account for 
perception and the like, statements of how acts are actually produced have been 
either absent or trivial (e.g., Bartlett, 1964; Festinger, Burnham, Ono & Bamber, 
1967; Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967; Sperry, 
1952). Curiously, action-based theories of perception and of mind in general 
have been advanced on a nonexistent theory of action. 

Thus, it seems that the theory of action deserves more attention than it has 
received and that the interlacing of the processes of perceiving and acting is a 
problem we can perhaps no longer afford to ignore. This chapter is a preliminary 
and speculative response to these reproofs. Its purpose is twofold: first, to 
identify a set of basic principles to characterize the style of the action system in 
the production of coordinated activity; second, to describe, in a rough and 
approximate way, how the contents of vision may relate to the processes of 
action. To a significant degree, the ideas expressed in this chapter derive, on the 
one hand, from the work of Nicolai Bernstein (1967) and Russian investigators 
who have followed his intuitions, and on the other hand, from the analysis and 
amplification of the Russian views by Peter Greene (1971a, b). We begin our 
inquiry by illustrating an equivalence between problems of action and problems 
of perception and cognition (cf. Turvey, 1974), 


THE CONSTANCY FUNCTION IN ACTION, 
AND ACTION AS CONSEQUENCE 


A visually presented capital letter A can occur in various sizes and orientations 
and in a staggering variety of individual scripts. Yet in the face of all this 
variation, the identification of the letter remains, for all intents and purposes, 
unaffected. 
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This phenomenon of constancy is not limited to the domain of perception, but 
is equally characteristic of action. Thus, the letter A may be written without 
moving any muscles or joints other than those of the fingers. Or, it may be 
written through large movements of the whole arm with the muscles of the 
fingers serving only to grasp the writing instrument. Or, more radically, one can 
write the character without involving the muscles and joints of either arms or 
fingers, by clenching the writing instrument between one’s teeth or toes. It is 
evident that a required result can be attained by an indefinitely large class of 
movement patterns. 

On examination of the phenomenon of constancy in action we might raise the 
query: How can these indefinitely large classes of possible movement patterns be 
stored in memory? The answer is that they are not. Clearly, I do not have on 
record in memory all possible temporal sequences of all possible configurations 
of muscle motions that write A; indeed, I have yet to perform them and by all 
accounts I never will. The essential question about our A-writing task, therefore, 
can be stated more fundamentally: How can I produce the indefinitely various 
instantiations of A without previous experience of them? 

In response to this question let us turn our attention to linguistic theory. A 
departure point for transformational grammar is that our competency in lan- 
guage is such that we can produce and understand a virtually infinite number of 
sentences. As Weimer (1973) has pointed out, there are echoes of Plato’s 
paradoxes in Chomsky’s (1965) claim that our competence in language vastly 
outstrips our experience with it. Chomsky’s claim is motivated by the observa- 
tion that experience with a limited sample of the set of linguistic utterances 
yields an understanding of any sentence that meets the grammatical form of the 
language. To explain this competency is, for Chomsky (1966), a central problem 
in the theory of language. But, given the points advanced above, the constancy 
function in action is likewise indicative of a competency that exceeds prior 
learning. The child, we may note, learns to write A under conditions that restrict 
her to a small subset of the very large set of A-writing movements. But she is 
able subsequently to write A with practically any movement pattern she 
chooses, that is, she can write A in novel ways. A writing is creative in the sense 
that language is creative. 

The search for a workable account of the creativity manifest in language has 
led transformational grammarians to what can be aptly described as “the 
explanatory primacy of abstract entities” (Hayek, 1969). The idea is that the 
speak—listener has at his disposal an abstract system of rules or principles, 
referred to as the deep structure, that allows him to generate and to understand 
an indefinitely large set of sentences, referred to as the surface structure. This 
distinction, drawn in linguistic theory, between deep and surface structure will 
prove relevant to our analysis of action in two important respects. The first is 
the idea that deep structure is far removed from surface structure, grammarians 
argue that although the deep structure determines the surface structure, it is not 
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manifested in the surface structure. The second is that the child must come to 
determine the nature of the underlying deep structure from a limited experience 
with surface structures. Chomsky and his colleagues assume that the child 
essentially “looks through” the utterances she hears to the abstract form behind 
those utterances. The child is said, therefore, to construct a theory of the 
regularities of her linguistic experience. Similarly, our hypothetical child learning 
to write the letter A must determine from her limited experience with the set of 
A-writing movements a theory of how to write A. Thus, we may conclude that 
the ability to write A in indefinitely various ways is based on procedures that are 
abstract and generative, like the grammar Chomsky has in mind for language. 
Others have sought similar parallels between action and grammar (¢.g., Lenne- 
berg, 1967). 

There is an interesting upshot to this discussion of action constancy. We 
generally say that an abstract representation, a concept, underlies our ability to 
recognize indefinitely various As. Let us call this the perception concept of A. 
Now clearly we may propose that there is an action concept of A underlying our 
ability to write A in indefinitely various ways. So in general are there two 
different kinds of structures, two different classes of concepts—one specific to 
perceptual events, the other specific to action events? In short, is the constancy 
function in perception achieved in ways fundamentally different from the 
constancy function in action? If it is, then the construction of theories of how 
we identify events (see Neisser, 1967)—theories of the perception concept—can 
proceed virtually independent of the construction of theories of the action 
concept. On the other hand, if the constancy function is treated in the same way 
in both perception and action, that is, if there is only one class of appropriate 
structures or only one class of appropriate procedures for achieving constancy, 
then the theory of identification and the theory of production ought not to be 
considered separately. In this view, which I suspect is the more viable, any 
account of constancy in perception must also be an account of constancy in 
production—a perceptual account of constancy must be potentially translatable 
into an action account of constancy. If such a translation is in principle 
implausible then we may suppose that the account is incorrect. 

The reader’s attention is drawn in this preamble to one other important aspect 
of action—its relation to “consequence.” An act modulates environmental 
events, but philosophers have found that they cannot conceptually distinguish 
between occurrences that are actions and occurrences that are consequences (see 
Care & Landesman, 1968). A typical argument from language usage might go 
like this: George kicks the football (of the round kind) and scores the goal that 
wins the championship. Now we could say that George kicked the football and 
that a consequence of his action was that a goal was scored. Or we could say, 
just as appropriately, that George scored a goal with championship-winning 
consequences. “Scored the goal,” therefore, can be viewed either as consequence 
or as action. We may wish for criteria to determine which occurrences should 
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receive an action label, and which occurrences should receive a consequence 
label. Unfortunately, the criteria that have been advanced have not met with any 
degree of universal approval. 

The failure to distinguish conceptually between action and consequence is 
understandable from the viewpoint of Bernstein (1967). He comments: 


Whatever forms of motor activity of higher organisms we consider . . . analysis suggests 
no other guiding constant than the form and sense of the motor problem and the 
dominance of the required result of its solution, which determines, from step to step, 
now the fixation and now the reconstruction of the course of the program as well as the 
realization of the sensory correction [p. 133]. 


The implication is that an action plan as a statement of consequences is not a 
static structure but a structure that is by virtue of processes we will discuss 
below, continually becoming. Yet in all of its phases of change, phases that 
constitute a tailoring of the plan to the current kinematic and environmental 
contingencies, the essential character of the action plan remains invariant. 
What is to be achieved, what is to be consequence of the evolving pattern of 
motions, persists from the conception of an act through its evolution to its 
completion. 

The arbitrariness of distinguishing between action and consequences parallels 
the arbitrariness of distinguishing between perception and memory. As William 
James (1890) observed and as other concur (e.g., Gibson, 1966a), the travel- 
ing moment of present time is not a razor’s edge and no one can identify when 
perception ends and memory begins. The distinction between action and conse- 
quence is as much a will-o’-the-wisp as the distinction between perception 
and memory. 


THE DOMAIN OF ACTION CONCEPTS 


For present purposes we will entrust ourselves to the view of concepts as 
functions (Cassirer, 1957). Thus, we may represent an action concept such as 
that for A writing as A(x) and explore the nature of the variable x that enters 
into this function. We perform this exercise in order to identify some funda- 
mental characteristics of the action system. Let us assume that the elements 
entering into A(x) are a proper subset of the set of elements that enter into any 
rule for coordinated activity. And, in addition, that coordinated activity is under 
the management of an “executive system” and that the character of the 
elements entering into A(x) and any other action function are mirrored in the 
character of (or constraints on) this system. 

One view of the executive is that expressed in the traditional piano or 
push-button metaphor. In this metaphor muscles are represented cortically in 
keyboard fashion, one muscle per key, and central impulses to the muscles are 
held to be unequivocally related to movement. The essence of the view is that 
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the executive instructs each muscle individually. At the outset we may question 
the worth of this metaphor simply on the ubiquity of reciprocal innervation: 
The intricate and extensive interrelation among muscles makes it both arduous 
and wasteful to instruct them singly. But more importantly, we can argue (as did 
Bernstein, 1967) that there cannot be an invariant relation between innerva- 
tional impulses and the movements they evoke. 

Consider the movement of a single limb segment in relation to a fixed partner 
and under the influence of a single muscle. The differential equation describing 
this situation is of the form 


Id? ajdt? = f(E, a, da/dt) + g(a), 


where J is the inertia of the limb segment, æ is the angle of articulation, £ the 
innervational level of the muscle, and f and g are the functions determining 
respectively, the muscle force and gravitational force acting on the limb segment. 

If we take E = E(a, da/dt), that is, independent of time and simply a function 
of position and velocity, then the equation reduces to that for a movement of a 
limb indifferent to central influences; iri brief, an instance of central paralysis. If, 
for contrast, we assume that the excitation of a muscle is solely a function of a 
centrally predetermined sequence and independent of the peripheral variables of 
position and velocity, that is, Æ = E(t), then the equation is that of a system 
insensitive to, or ignorant of, changes in local conditions. Obviously, it is more 
judicious to argue that Z = A(t, a, da/dt), in which case the fundamental 
equation can be written 


IW ajdt? =f [E(t, a, da/dt), a, da/dt] + g(a). 


Solutions to equations of this kind depend on the initial conditions of integra- 
tion. The implication, therefore, is that in order to obtain the same movement 
for various values of a and da/dt, different innervational states £ will be needed. 
In a word, the relationship existing between impulses to the muscle and the 
movement of the single limb segment is equivocal: same impulses may produce 
different movements and different impulses the same movement. 

We continue Bernstein’s argument by noting that in the temporal course of 
moving a limb segment changes occur in the force of gravity (which is related by 
a function g(a) to the angle of articulation) and in other external forces 
operating on the limb and that these changes affect Æ. Now suppose that the 
limb segment traces out a thythmical motion. This rhythmical motion can be 
identified with a function relating the required forces at the joint to time. 
However, another function can be identified relating forces at a joint to time, 
and the forces in this case correspond to the changes in the external force field. 
As a result, the sequence of impulses to the muscle can be interpreted as 
determining a mapping of the function generated by the variations in the 
external field over time to the desired function. Now suppose that the same 
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rhythmical motion is traced out with the hand holding on separate occasions (a) 
a hammer, (b) a baton, and (c) a can of beer. The function relating the changes 
in the external force field to time will differ in each instance even though the 
pattern of the rhythmical movement is unchanged. In each of the three instances 
a different mapping would be required from the function generated by the 
external force field to the desired function specifying the rhythmical pattern. 
The import of this, as Bernstein (1967, pp. 20-21) points out, is that the 
sequence of impulses to the muscle “cannot maintain even a remote correspon- 
dence” to the factual form of the movement. 

A third criticism of the push-button metaphor is that if the executive behaved 
in the fashion suggested, instructing each muscle individually, then it would be 
called upon to manage the enormous number of degrees of freedom that the 
motor apparatus attains 

...both in respect to the kinematics of the multiple linkages of its freely jointed 

kinematic chains, and to the elasticity due to the resilience of their connections—the 

muscles, Because of this there is no direct relationship between the degree of activity of 

muscles, their tensions, their lengths, or the speed of change in length [Bernstein, 1967, 

p. 125]. 


Herein lies a fundamental principle which simply states that the number of 
degrees of freedom of the system controlling action is much less than the 
number of mechanical degrees of freedom of the controlled system (Kots, 
Krinsky, Naydin, & Shik, 1971). A homely example illustrates the point: try 
writing a letter, for example, W, while simultaneously making circular motions 
with a foot. An experimental illustration is provided by Gunkel (1962): when 
one makes movements of different rhythms simultaneously with the two hands, 
the amplitude of the movements performed by one of the hands is modulated by 
the frequency of the movements performed by the other. Thus, it is not difficult 
to demonstrate that the number of degrees of freedom of the executive is very 
small; on the push-button metaphor it would have to be very large. We can 
conclude, therefore, on three counts, that the executive does not, or indeed 
cannot, control individually each motor unit or even each muscle participating in 
a complex act. 

One consequence of the conclusion that in the course of performing a 
coordinated pattern of movements the executive system does not control mus- 
cles singly is that it need not be apprised of peripheral details, since such 
information would be irrelevant. In this light let us take another look at the 
equation for the movement of a single-limb segment. In that equation the 
innervational impulse is expressed as a function of time, angle of articulation 
(muscle length), and velocity, that is, £ = E(t, a, da/dt). But if the executive is 
stripped of the responsibility for instructing individual muscles and if it is 
ignorant of the current, precise details of the external force field then clearly 
executive instructions are not written in the form relevant to that field, that is, 
in the form E(t, a, da/dt). 
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Moving a single-limb segment rhythmically requires an action plan and we may 
suppose that executive instructions spell out that plan (in the sense of defining 
the contours and timing of the movement) through a sequence of impulses of 
the form A(t). The action plan and impulses of the form A(t) must correspond, 
or so it would seem, to the factual form of the movement, in contrast to 
impulses of the form E(t, a, da/dr), which on the above account bear no such 
relationship to the movement. Thus, we see that the action plan (the deep 
structure) is dissimilar to the innervational signals issued to the muscles (the 
surface structure) and these signals in turn are dissimilar to the movement that 
evolves: “...it is as if an order sent by the higher center is coded before its 
transmission to the periphery so that it is completely unrecognizable and is then 
again automatically deciphered” (Bernstein, 1967, p. 41). In general, if impulses 
of the form E(t) are close to the action plan and hence close to the actual form 
of the movement, then those impulses of the form E(t, a, da/dt) are close to the 
muscles and to the actual forces operating at the joint complexes. On this view, 
the mapping of E(t) to E(t, a, da/dt) identifies the evolution of an act; in 
particular, it identifies the adaptation of an action plan to the prevailing field of 
external forces. 

But if the executive does not control individual muscles, then what does it 
control? In response to this question, students of action (e.g., Bernstein, 1967; 
Gelfand, Gurfinkel, Tsetlin, & Shik, 1971) propose that the executive charge is 
to control the modes of interaction of lower centers. These, it is argued, are 
capable, through the systems that they govern, of producing a coordinated 
movement pattern in a relatively autonomous fashion. 

‘Consider a commonplace, coordinated activity such as running. There are 
lower centers that control individual limbs, with each center asserting particular 
relations among the components of the limb that it controls. Thus, the inter- 
action between these centers determines the coordinate motion of the limbs, and 
the problem of coordination in running becomes for the executive a problem of 
intercenter coordination (Shik & Orlovskii, 1965). Let us pursue this example in 
more detail because it is representative of a mode of organization that we will 
entertain as characteristic of the action system. 

We have evidence that mechanisms inherent in the segmental apparatus of the 
mammalian spinal cord can initiate and maintain flexion-extension or stepping 
movements of the limbs in the absence of afferent participation (Eldred, 1960). 
Apparently, these segmental pattern generators determine the fundamental 
form of flexion-extension activity, but they do not specify in detail the actual 
spatial and temporal characteristics of the motion (Engberg & Lundberg, 1969). 
It is the role of afferent information, enumerated through autonomous (reflex) 
structures (and of tuning influences from above, as we shall see later) to supply 
the requisite spatial and temporal details and thus to tailor the basic pattern to 
the field of external forces. A small leap now takes us to the assertion that 
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walking and running can be attributed to a relatively simply executive instruc- 
tion which sets into charcteristic motion the entire segmental apparatus and 
which, in itself, is deficient in information about the actual strategic order of 
necessary muscle contractions (cf. Evarts et al., 1971). 

This mode of organizing action achieves the following. First, it resolves the 
degrees of freedom problem noted above by apportioning relatively few degrees 
to the executive level but relatively many to the subsystems whose activities the 
executive regulates. (Since it is the subsystems that must deal with the vagaries 
of kinematic linkages and muscles). Second, and related, it reduces the detail 
required of the executive instructions, for with autonomous lower centers those 
instructions do not have to be coded for the individual muscle contractions that 
will ultimately occur. 

In overview, what has emerged is the understanding that the element entering 
into the design of an act is typically not an individual muscle but a group of 
muscles functioning cooperatively together. We have good reason to speculate 
that the reflexes may well comprise the “basis” of the set of all such functional 
groupings and hence of the infinitely large set of all acts (Easton, 1972a). A 
“basis” is a mathematical structure found in the theory of vector spaces. It is 
defined as a linearly independent (nonredundant) set of vectors that under the 
operations of addition and scalar multiplication spans the vector space. Essen- 
tially, a “basis” contains the minimum number of elements that are required to 
generate all members of the set. 

We have several reasons for identifying the set of reflexes as the “basis” for 
action. First, reflex systems are not independent entities that function in 
isolation. On the contrary, there are a multiplicity of functional relations among 
reflexes and other structures. Second, virtually every reflex observed experi- 
mentally and clinically is an instance of a reasonably complex configuration of 
motions often elicitable by a single stimulation. Third, reflex systems are under 
very effective and often complex control by supraspinal structures (cf. Eccles & 
Lundberg, 1959; Evarts et al., 1971; Kuno & Perl, 1960). And fourth, reflexes 
are obviously purposeful and adaptive, and they may be organized and modu- 
lated flexibly by means of the operations of ordering, summing, fragmentation, 
and through their “local sign” properties (Easton, 1972a). Collectively, these 
characteristics of reflexes suggest that 

...the neuronal mechanisms which have been studied as reflex arcs can be utilized in a 

variety of ways by virtue of the interaction between reflex pathways and by the action 


of control systems that are present, even at the level of the spinal cord segment. The 
dichotomy between reflex control and central-patterning control of movement may in 


this sense be artificial [Evarts et al., 1971, P- 62). 


Through the provision of reflexes, evolution has supplied a partial answer to 
the degrees of freedom problem. We might now suppose that a further reduction 
in the burden of control is achieved ontogenetically through the gathering 
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together of reflexes into larger functional units (cf. Gelfand et al., 1971; Paillard, 
1960; Pal’tsev, 1967b). We will refer to reflexes and functional combinations of 
reflexes as “coordinative structures” (a term borrowed from Easton (1972a) but 
used here with greater latitude).! Of cardinal importance to this chapter is the 
assumption that a closely knit functional combination of reflexes performs as a 
relatively autonomous unit; by this assumption, relative autonomy is a funda- 
mental property of coordinative structures, at any level of complexity.” 

In sum, we have seen that the executive does not construct acts from 
individual muscle contractions. What we now infer is that acts are synthesized 
from a set of coordinative structures for which the reflexes constitute a basis. 

We return now to the question of the variable entering into action concepts of 
the form A(x). The executive does not deal in muscles, so muscle properties 
(length, tension) can be ruled out. The executive does deal in coordinative 
structures (at least so we may argue), but these similarly cannot be the elements 
we seek. An action concept such as that supporting A writing is indifferent to 
functional groupings of muscles in the same way that it is indifferent to 
individual muscles. However, the analogy drawn above between the set of 
teflexes and a “basis” in vector space theory provides a clue to the answer, To 
reiterate: A basis is a subset of a set of elements which when acted upon by 
Suitable operations generates the entire set of elements. We assume, therefore, a 
repertoire of operations that modify and relate the coordinative structures so as 
to produce any and all acts. Thus, we may conjecture that the elements entering 
into an action concept are the operations defined over the set of coordinative 
structures. In this sense an action concept is analogous to a mathematical 
operator, a function whose domain is a set of functions, of which differentiation 
is a classical example. 


"One motivation for bringing reflexes and functional combinations of reflexes under the 
single heading “coordinative structures” is the assumption that for the activation of either a 
single reflex or a single functional combination of reflexes, one degree of freedom of the 
control system is enough (see Kots et al., 1971). In regard to functional combinations it is 


and their establishment as single functional units. In very large part acquiring a skill is, as 
Bernstein (1967) would have expressed it, a problem of teducing the degrees of freedom in 
the action structures being regulated (see Kots & Syrovegin, 1966). 

* Consider insect flight. The evidence Suggests that it is not due to a built-in structural 
system of simple segmental reflex loops nor to any flight center, yet identified. Rather, it 
seems that there is a functional system of distributed oscillators—autonomous pattern 
generators-which on receipt of the appropriate nonphasic input are coupled together as a 
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THE ORGANIZATION OF THE ACTION SYSTEM 


The foregoing account identifies two particularly important properties of the 
action system. First, acts are produced by fitting together structures each of 
which deals relatively autonomously with a limited aspect of the problem. 
Second, the action plan is stated crudely “in three-dimensional kinematic lan- 
guage” (Gelfand er al., 1971), yet the actual pattern of motions is precise in 
displacement, speed, and time of occurrence. To achieve this measured perfor- 
mance the differentiation of an action plan must proceed through multiple 
stages of computation in which needed details emerge gradually. Patently, a 
computation of details over time is inelegant and inefficient for a system that 
has a limited repertoire of skills, but it is preferred for a system called upon to 
solve novel action problems posed by ever-varying kinematic and environmental 
conditions. 

We commonly classify a system that behaves in this fashion as hierarchic, a 
classification that is certainly suggested by the unqualified use of the term 
“executive” in the preceding discussion. By a hierarchy we understand that an 
executive at the highest level of a decision tree makes the important decisions 
and spells out the fundamental goals. Decisions on the details are left to the 
immediately subordinate structures which in turn leave decisions that they 
cannot make, for whatever reason, to even lower structures. This general strategy 
is repeated until the final remaining decisions are made by the lowest structures 
in the decision tree. 

The crucial property of a substructure in a hierarchy is that in the perspective 
of a higher level it is a dependent part, but in the perspective of a lower level it is 
an autonomous whole. Koestler’s (1969) term “holon” expresses this whole-part 
personality of hierarchic substructures, a holon is defined as “a system of 
relations which is represented on the next higher level as a unit, that is, a 
relatum” (Koestler, 1969, p. 200). We may question, however, the notion 
explicit in the concept of a hierarchy that the direction of the whole-part 
personality of substructures is immutable. Certainly from the viewpoint of the 
“geometry” of anatomical arrangements certain structures may appear as depen- 
dent parts of other structures, and a compelling argument may well be made for 
the immutability of this relation in the peripheral reaches of the neural mecha- 
nisms supporting action and perception. Yet from a computational viewpoint in 
which we emphasize “knowledge” structures rather than anatomical structures 
the relation between any two structures need not be fixed; either may treat the 
other as a relatum, or subprocedure, depending on the problem to be solved at a 
given moment. This commutability of “subordinate” and “executive” roles, of 
“lower” and “higher,” is expressed in the related interpretations of biological 
systems as “coalitions” (Reaves, 1973; Shaw, 1971; von Foerster, 1960) or 
“heterarchies” (Minsky & Papert, 1972). In these interpretations, management 
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of the action system would not be the prerogative of any one structure; many 
structures would function cooperatively in the framing of action plans and 
desired consequences, although not all structures need participate in all decisions 
(Reaves, 1973). Furthermore, while it is certainly the case that the action system 
has very definite and nonarbitrary (anatomical/computational) structures, in 
these interpretations the partitioning of these structures into agents and instru- 
ments and the specification of relations among them is arbitrary. Any inventory 
of basic constituent elements and relations is equivocal (Reaves, 1973). Decen- 
tralization of control and arbitrariness of partitionings are not alien notions to 
students of action theory (e.g., Berstein, 1967; Greene, 1971b) as evident from 
Greene’s apologia: 
The ‘executive’ and ‘the low-level systems’ will occur frequently ... These terms are 
simply abbreviations for what I really mean: any two subsystems, one of which, at the 
moment, in respect to the task under consideration, is behaving like an executive relative 
to the other. The systems are not unique, and their relation is not immutable: a ‘lower’ 


part of the nervous system might, for instance, at some time behave like an executive 
relative to some higher part [Greene, 1971b, pp. 2-3]. 


ACTION AS HETERARCHIC 


Perception and action contrast in that the tasks of the former are to digest, 
abstract, and generalize, while the tasks of the latter are to spell, concretize, and 
particularize (Koestler, 1969). One is the mirror image of the other. For the sake 
of argument and to facilitate comparisons with perception, let us say that the 
“input” to the action system is an intention (e.g., to pick up a cup, to write 
one’s name). (We respectfully ignore the problem of how an intention is 
determined and in addition we give due recognition to the likelihood that some 
of the structures responsible for determining an intention may also be responsi- 
ble for its translation into an action plan and for the plan’s subsequent differen- 
tiation), Therefore, an intention is an “event” for the action system in the way 
that, say, a scene is an event for the visual, perceptual system. 

Taking a leaf from artificial intelligence research on visual perception, we may 
say that action involves knowledge domains or abstract representations—where a 
representation is defined as a set of entities, a description of the relations among 
them, and a description of their attributes (Minsky & Papert, 1972; Sutherland, 
1973). Thus, for the perception of scenes portrayed in two dimensions we may 
identify, as examples, (1) a Lines Domain in which “bars, picture-edge, vertex, 
end, midpoint” are the entities; “join, intersect, collinear, parallel” are the 
relations; and “brightness, length, width, orientation” are the attributes; and (2) 
a more abstract Surfaces Domain where “surface, corner, edge, shadow” are the 
entities; “convex, concave, behind, connected” are the relations; and “shape, 
tilt, albedo” are the attributes (see Sutherland, 1973). From a hierarchical view 
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we might think of perception as an ordered sequence of unidirectional mappings 
from less abstract to more abstract representations and the differentiation of an 
intention as the successive mappings of the intention onto a series of progres- 
sively less abstract representations. But the argument from the coalitional/ 
heterarchical interpretations of organization is that the conversation between 
abstract representations (domains, knowledge structures) is not one way. A 
fundamental result in artificial intelligence research on scene analysis is that 
while it is necessary to construct descriptions in many different domains, a 
procedure that exploits only unidirectional mapping from a lower domain to the 
next and higher domain is significantly limited in its capability to interpret a 
scene successfully (Sutherland, 1973). Success in scene interpretation is greatly 
enhanced by allowing a more flexible strategy in which processing in lower 
domains can use, as subprocedures, hypotheses generated about structures in 
higher domains (e.g., Falk, 1972). 

Let us comment briefly on entities that in theory could be gathered together 
to form domains in action. On the basis of what has already been said it would 
be logical for us to identify the entities in a representation with coordinative 
structures. In this regard it is important that reflexes can be arranged on a scale 
from complex and wide-ranging to simple and local. The organization of reflexes 
reveals “parallel hierarchies of complexity whose regularity and order leave little 
to be desired: local spinal reflexes, such as the flexion reflex, appear to be 
subsumed by reflexes requiring an intact spinal cord such as the scratch and long 
spinal reflexes, and these in turn are subsumed by pontine and medullary 
reflexes, such as the tonic neck and labyrinthine reflexes, and, at still higher 
levels, by locomotion and righting” (Easton, 1972a, p. 593). This suggests that 
we might equate the entities in each abstract representation of an act with 
coordinative structures, remarking that in higher domains an action plan is 
represented by functions defined over a relatively small number of large and 
complex coordinative structures and in lower domains by functions over a 
relatively large number of small and simple coordinative structures. We are thus 
provided with the following description of the evolving act: an act evolves as the 
mapping by a coalitionally /heterarchically organized system of an intention onto 
successively larger collections of increasingly smaller and less complex coordina- 
tive structures, with each representation approximating more closely the desired 
action. 

There are many other ways we might conceivably characterize the entities of a 
representation in the building of a theory of action, but I hope the arguments 
that follow will pinpoint the special advantages conferred on a system in which 


the entities at all levels of representation are relatively autonomous structures. 


At all events, let us simply note at this juncture two contrasts between action as 
e contrast is between the 


hierarchy and action as coalition/heterarchy- In one, the à 
hierarchic strategy of a detached higher level dictatorially commanding lower 
levels and the heterarchic strategy of procedures constructing a representation in 
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a higher domain entering into “negotiations” with lower domains in order to 
determine how the higher representation should be stated. In the other, the 
contrast is between the hierarchic principle of low-level structures unquestion- 
ingly responding to high-level instructions and the heterarchical principle of 
procedures establishing a representation in a lower domain reprocessing higher 
representations from the perspective of the special kinds of knowledge available 
to the lower domain. 

The anatomical and neural structure of mechanisms related to movement 
suggests quite strongly that the fluidity called for by coalitional/heterarchical 
organization, the constant shuttling back and forth between domains, is not 
without basis. Consider, for example, the notion of internal feedback. Most 
generally the idea of feedback in behaving organisms is identified in two senses. 
In one sense, it is information that arises from the muscles as a direct conse- 
quence of their being active; in the other sense, it is information originating 
outside the organism as an indirect consequence of muscular contraction. The 
latter is often dubbed “knowledge of results.” These senses of the concept of 
feedback are not exclusive, for they omit the afferent information that arises 
from structures within the nervous system in the course of an act’s emergence. 
We refer to this feedback from the nervous system to itself as internal (Evarts et 
al., 1971) and it plays a central role in the evolution of coordinated activity. 

In reference to internal feedback from spinal centers, Oscarsson (1970) has 
remarked on the fact that a number of ascending pathways (at least six 
spinocerebellar tracts) are not especially well equipped to provide information 
about muscle contraction. Rather, the organization of these ascending paths 
Suggests that they monitor activity in spinal motor centers, which in turn 
provide an abstracted account of the relation between themselves and other 
lower centers. This property of ascending paths fits the character of descending 
paths: most descending fibers terminate in interneuronal pools rather than 
passing directly to motor neurons. The basis for this arrangement may lie in the 
fact that the coordination of movement rests on the patterning of groups of 
motor neurons rather than on instructions to individual units, and the mapping 
between domains consists of predictions of how functional groupings of muscles 
(coordinative structures) will behave (cf. Arbib, 1972). Spinal centers thus 
provide a means for checking predictions against the current status of lower 
centers. Therefore, interneuronal pools may function as “correlation centers” 
(Arbib, 1972) reporting the degree to which an action plan is evolving as desired 
or indeed capable of evolving in the desired manner from a particular representa- 
tion. At all events, there are probably many such internal feedback loops 
broadcasting the state of each level of the actor from executive to muscle (Taub 
& Berman, 1968), a highly desirable state of affairs from the perspective of a 
strategy in which executive procedures draw rough sketches and low-level 
procedures furnish needed details. 

Our appreciation of the flexible relations between neuroanatomical structures 
supporting action is fostered further by recognition of the fact that signals from 
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above can bias the abstracted accounts supplied by spinal centers. Many supra- 
spinal mechanisms exert influences on the first synapse in ascending systems, that 
is, the synapse between the peripheral afferent neuron and the second order 
neuron which crosses the spinal cord to the tracts projecting to the brain (Ruch, 
1965a). These influences from above are exerted mainly by motor areas and 
motor tracts, including the classically defined principal motor tract, the pyrami- 
dal (corticospinal) tract. 

Current deliberations on the interrelations among the motor cortex, basal 
ganglia, and cerebellum may well be resolved on the acceptance of the coali- 
tional formulation (see Kornhuber, 1974). We know that before the first signs of 
muscle innervation relevant to a particular movement significant changes occur 
in the activities of the cerebellum and basal ganglia, in addition to the motor 
cortex (Evarts, 1973; Evarts et al., 1971). This contrasts sharply with more 
traditional interpretations of basal ganglia-cerebellar processes operating as move- 
ment control and error-correcting devices coming into play only after the 
innervation of muscles. Rather, it would seem that these mechanisms gang 
together in the constructing and differentiating of action plans—they incorporate 
different procedures, each using the others as subprocedures as the situation 
demands. The structure of the cerebellum and its relations with other structures 
exemplifies the flexibility of neural computation in action. The cerebellum 
receives inputs from the entire cerebral cortex, projects to the motor cortex. 
(Evarts & Thach, 1969), and is in two-way communication with the segmental 
apparatus of the spinal cord and thus with the structures that will actually 
execute the intended configuration of motions. 

Thus the cerebellum can operate as a comparator, relating information about 
cerebral events to information about spinal events. The argument has been made 
that the cerebellum carries out a speeded-up differentiation of representations of 
the action plan, thereby providing a projection of their outcome and a basis for 
their modification. On this argument the cerebellum plays a significant role in 
tailoring action plans to prevailing environmental and kinematic conditions prior 
to their realization as muscle events and thus prior to feedback from muscle 
contraction (see Eccles, 1969; Kornhuber, 1974). 


EXECUTIVE IGNORANCE: 
EQUIVALENCE CLASSES AS INSTRUCTIONAL UNITS 


Clearly, coalitional/heterarchical organization is far more flexible than hier- 
archical organization. Yet this flexibility is constrained in important ways. For 
example, in action there would be limits on the depth to which procedures 
constructing a representation on a higher domain may go in search of useful 
hypotheses. For any given higher abstract representation of an intention the 
utility of knowledge about any lower domain would be inversely related to the 
degree of abstraction separating the two domains. Hypotheses about individual 
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alpha-gamma links (the smallest coordinative structures) regulating muscle con- 
traction, for example, would not be useful to the determination of relevant large 
coordinative structures and related functions. And this, of course, is no more 
than a restatement of the degrees of freedom problem noted above. It then 
follows that while a representation of an intention in a higher domain is mapped 
into an immediately lower domain, the particular form that the representation 
will actually take in the lower domain cannot be known in advance, for the 
procedures operating in the lower domain have access to knowledge that is 
immaterial, in principle, to the procedures in the higher domain. 

This form of “ignorance” has been duly recognized by students of action. We 
recall the earlier comment that the role of the executive (which is understood to 
be not a single neuroanatomical structure but a set of procedures engaging a 
number of neuroanatomical Structures) is to modify the mode of interaction 
among elements at a lower level (Bernstein, 1967; Gelfand et al., 1971). As a 
general rule, however, it is argued that the executive does not have advance 
knowledge of which particular state, out of a set of possible states, a lower level 
will arrive at after a mode of interaction has been specified (Greene, 1971b; 
Pyatetskii-Shapiro & Shik, 1964). In this perspective Greene (197 1a) asks: “Can 
there be units of information that behave deterministically, even though the 
executive can rarely specify control functions more narrowly than to place them 
within broad classes of possible realizations? [p. 22].” Consider a situation in 
which the executive specifies a function transferring a given system into a 
“model” state. Now we may say that the “model” state serves not as a binding 
decree to be followed dogmatically by the system but rather as the identifier of 
a “ballpark,” that is, an equivalence class of states convertible into the “model” 
state. For the system two states are defined as equivalent if they differ by a 
transformation that is realizable by the system. To Greene’s (1971a, b) way of 
thinking the interconverting of states or functions is characteristic of low-level 
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degree of uncertainty in executive commands, since for any given system the 
executive is specifying an unknown member of a family of possible functions or 
states. “This uncertainty introduces ambiguities and errors in an executive 
system’s memory, commands, and communications to other executive systems” 
(Greene, 1971b, pp. 4-11). And we must expect these ambiguities and errors to 
be propagated through the action system during an act’s evolution. The question 
arises, therefore, of how a configuration of motions is coordinated with preci- 
sion and finesse? Indeed, in the face of this apparent chaos we should ask how 
coordination can be achieved at all? We can only assume that the action system 
is so constructed and its procedures so related that these ambiguities and errors 
are immaterial to the differentiation of an action plan (cf. Pyatetskii-Shapiro & 
Shik, 1964). 

For Greene (197la, b) the answer lies in the relations among the various 
equivalence classes: even though errors induced in one class may lead to 
erroneous specification in another, that specification would still be confined 
within the equivalence class of the desired function. Thus the equivalence classes 
as invariant units of information provide a means for specifying instructions in 
terms that are reliable and intelligible, even though an executive system is 
ignorant of the desultory character of low-level systems. We may summarize 
with Greene (1971a): 

Roughly speaking the equivalence classes serve as ballparks into which it is sufficient for 
the executive to transfer the state: once the state enters the ballpark it will be 
automatically brought to the correct position without further attention-although 
ambiguities inevitably lead to erroneous signals, these signals will never be moved 
outside their correct ballparks or equivalence classes. Hence the equivalence classes seem 
to be systematically behaving units of information in situations in which the individual 
elements themselves will behave in haphazard fashion [pp. 24-25] .* 


THE ACTION PLAN AND THE ENVIRONMENT 


Taking stock of our analysis thus far, we may draw a rough sketch of how an 
action plan is represented in the highest domain, namely, as the specification ofa 
subset of large coordinative structures that almost fits what is intended, and a set 
of functions on that subset (identifying the necessary equivalence classes) that 


>From what has already been said, it is evident that the derivation of a pattern of motions 
from its underlying representation is the cumulative result of the application of a long series 
of “rules.” We should suppose, therefore, that there are regularities in the representation of 
an act in the highest domain that are obscured at the movement level by the application of 
these rules. In part, Greene’s (1971a) equivalence classes are an attempt to recover the 
regularities, and the rules of the action system are defined by the conditions underlying the 
change in identity of functions and states, that is, interconverting of elements within a class. 
Obviously the enterprise undertaken by Greene has much in common with current ap- 
Proaches to problems in linguistics. 
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will relate its elements both adjecently and successivly in a particular way. Thus, 
the serial nature of an act is said to arise not from the extemporaneous linking of 
component motions but rather from the differentiation of an already formed 
plan (cf. Bernstein, 1967; Evarts et al., 1971; Lashley, 1951; Pribram, 1971). We 
have not, however, made any comment on the relation between the action plan 
and perception. To rectify this omission we return once again to the nature of an 
action concept, more precisley, A(x). We do so on the following rationale: If 
some common ground between the action concept A(x) and its perceptual 
counterpart can be identified, then perhaps we can gain some perspective on the 
relation between the action plan and the perceived environment in which the 
action plan is to unfold. 

Consider, much as we did before, a sample of As written by the same 
individual using different muscle combinations, that is, one A may have been 
written by small motions of the fingers, another by large motions of aleg with 
the writing instrument grasped between the toes. Members of the sample will 
differ metrically: They will probably be of different sizes, of varying orienta- 
tions, and of differing degrees of linearity, that is, some will be written in curved 
strokes, while others will be virtually straight. And supposedly all members of 
the set will differ spatially in that they will occupy different locations on the 
page. On inspection we would probably have little difficulty identifying each 
members as an instance of capital A. But in what sense are they equivalent? In 
geometry, figures are defined as equivalent with respect to a group of trans- 
formations. We say that two figures are equivalent if and only if the group 
contains a transformation that maps one figure onto the other. The group of 
transformations relevant to this discussion, that is, relevant to our sample of As, 
is clearly nonmetrical and, by elimination, must be topological. It is nonmetrical 
properties rather than metrical properties (which would be left undisturbed by 
the metric groups: the group of motions, the similarity group, and the equiareal 
group) that are of significance to the perceptual determination of membership in 
the class of capital A. 

But the same token, the action concept supporting A writing is determined by 
nonmetrical properties rather than metrical properties. After all, the sample of 
As we are considering was the product of an actor, and the sample, as we have 
noted, is indifferent to metrics. Since this is no more than a paraphrase of an 
argument by Bernstein (1967) we should let Bernstein draw the relevant conclu- 
sion concerning the action concept of A: 


The almost equal facility and accuracy with which all these variations can be performed 
is evidence for the fact that they are ultimately determined by one and the same higher 
directional engram in relation to which dimensions and position play a secondary 
role... the higher engram, which may be called the engram of a given topological class, 
is already structurally extremely far removed... from any resemblance whatsoever 
to the joint muscle schemata; it is extremely geometrical, representing a very abstract 
motor image of space [p. 49]. 
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In short, the action concept of writing A and the perception concept for 
identifying A share common ground in their dependence on nonmetrical proper- 
ties, and it is not difficult to imagine an isomorphic relation (of some degree) 
between them (Turvey, 1974). But this is a very special case and we may ask: 
Does the action plan in general relate in similar fashion to the perceived 
environment? Bernstein (1967) hazards the guess that it does. For him the 
high-level abstract representation of an action plan may be construed as a 
projection of the environment relevant to the intention, where this projection 
relates to the environment topologically but not metrically. 

Owing to the vagueness of this argument we may feel that we have not really 
acquired any new insights (after all, what does it mean to talk about an 
isomorphism between action plans and environmental events?) Yet honoring our 
eccentricities for the present we may acknowledge that we have reinforced our 
respect for the action plan and the action coalition/heterarchy. In earlier 
arguments we established the fact that the high-level abstract representation of 
an action plan was not a projection of muscles and joints. In the current 
argument we maintain that an action plan may be usefully construed as a 
projection of the environment. Therefore, we view the task of the action 
coalition/heterarchy as that of translating an abstract projection of the environ- 
ment into joint-muscle schemata. 

Some research by Evarts (1967) is of special relevance to these speculations. 
Evarts showed that when a monkey makes a movement of the wrist to counter- 
act an opposing force (in a task in which the direction of force and the direction 
of displacement are varied orthogonally) recordings from unit cells in the motor 
cortex are related to the amount of force needed rather than to the degree of 
displacement. Moreover, this activity in the motor cortex is manifest prior to 
evidence of muscular contraction. As Pribram (1971) points out, Evarts’ observa- 
tion suggests that the representation at the motor cortex is a mirror image of the 
field of external forces. But by our account this “image” must represent the 
action plan at a fairly late stage in its differentiation and, in terms of the earlier 
analogy with linguistic theory, is more closely related to the surface structure of 
art than to the deep structure. Indeed, Evarts (1973) has claimed recently that 
the representation of movement at the motor cortex, rather than identifying the 
highest level of motor integration (a classical point of view), is, on the contrary, 
much closer to the muscles and hence much lower in the organization of the 
action system than representations in other (traditionally lower) anatomical 
systems such as the cerebellum and the basal ganglia. A 

Accepting the proximity of this motor cortical representation to the act’s 
surface structure we can see that by this level the action heterarchy has 
transformed a projection of topological properties of the environment into a 
projection of environmental contingencies (e.g., forces). According to Bates 
(cited by Evarts, 1973) force is the logical output for the motor cortex; velocity 
is the single integral of this quantity and displacement is the double integral, and 
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both of these quantities are theoretically more difficult to specify than force 
itself. Yet ultimately acts call for accurate displacements, and accurate displace- 
ments, in turn, call for a projection of metrical properties of the environment. 
We are led, therefore, by this reasoning to another description of the evolving 
act, namely, that the action plan unfolds as a series of progessively less abstract 
projections of the environment. 


THE PROBLEM OF PRECISION AND THE CONCEPT OF TUNING 


The realization of an action plan as a coordinated pattern of motions requires its 
translation from the crude language of coordinative structures to the precise 
language of muscles. Commerce between animal and environment reduces ulti- 
mately to the regulation of pairs of antagonistic muscle groups coupled together 
at joints. In the translation from abstract action plan to mechanical response the 
alphamotorneuron stands as the penultimate component. The central question 
now is how can alphamotorneurons specify to muscles the needed lengths and 
tensions when these terms, length and tension, are not in the lexicons of higher 
domains and hence, by definition, cannot be ingredients in an action recipe. In 
short, we seek to understand more fully the mechanisms through which the 
action system generates precise commands to muscles from crude commands to 
coordinative structures. 

An instructive portrayal of the problem in limited form follows from the 
concluding comments of the preceding section—that a suitable output for the 
motor cortex is force. Suppose that subsequent processes, metaphorically speak- 
ing, integrate this quantity. Then, as already noted, the single integral will yield 
velocity and the double integral will yield displacement. But the particular 
velocity and the particular displacement obtained for any given force will 
depend on the end-points or limits of integration. Thus, specification of force 
alone is insufficient for the achievement of a desired velocity or displacement— 
the limits of integration must also be identified. How such “end points” might 
be supplied in the relating of action to environment is the kernel of our problem. 

We now proceed to consider and illustrate properties of the spinal cord that 
will aid our inquiry. Earlier we remarked that the role of higher levels of the 
nervous system is to pattern the interactions within and among coordinative 
structures. Let us now recognize that, in the main, coordinative structures have 
their origins in the spatially divided and relatively autonomous sybsystems of the 
spinal cord. And let us modify our terms slightly to read: the role of the higher 
levels of the nervous system is to modulate interactions within and among neural 
Ken at the spinal level (cf. Gelfand er al, 1971; Obituary: Tsetlin, 
1966). 


4This viewpoint is also expressed by students of motor control in insects (e.g., Rowell, 
1964; Weiss-Fogh, 1964; Wiersma, 1962). 
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The segmental apparatus of the spinal cord is a functional entity well suited to 
the organization of coordinated activity. Its component structures are richly 
interconnected by a variety of horizontal and vertical linkages, providing an 
intrinsic system of complex interactions which is no less essential for the 
evolving act than supraspinal influences. The spinal cord is an active apparatus 
that does not passively reproduce instructions from above (Gurfinkel, Kots, 
Krinskiy, Pal’tsev, Fel’dman, Tsetlin, & Shik, 1971) and, indeed, may regulate its 
degree of subordination to supraspinal mechanisms (Sverdlov & Maksimova, 
1965; Verber, Rodionov, & Shik, 1965). Several properties of spinal cord architec- 
ture and dynamics provide the basis for this interpretation. We take note of 
some of them here. First, of the great many interneurons in the spinal cord 
relatively few are afferent neurons. Second, interneurons rather than motor 
neurons are the terminal points for the majority of descending fibers from the 
brain. Third, the majority of synapses in the spinal cord are formed by connec- 
tions between spinal neurons, and relatively few are formed from axons coming 
from the brain and spinal ganglia. And fourth, reciprocal facilitiation and 
inhibition, and myotatic reflex action are all processes at the segmental level (see 
Gurfinkel et al., 1971). 

The integrity of the spinal cord rests on the fundamental servoprocess manifest 
in the alpha-gamma link regulating muscle contraction. Upon this servoprocess 
are built the intra- and intersegmental reflexes. We suppose that the modus 
operandi for integrating reflexes (the basis of the set of coordinative structures) 
in coordinated action exploits rather than disrupts the fundamental servomecha- 
nism. Indeed, this will prove to be the key notion for unraveling the problem of 
how precise instructions are formulated at the level of alphamotorneuron 
activity. 

But before examining the evidence for this view we observe that in the 
unfolding of the action plan on the segmental apparatus the responsibility for 
demarcating coordinative structures and for the parsing of those structures may 
devolve on separate neuroanatomical systems. Greene (1971b) cites a series of 
experiments by Goldberger in which the corticospinal and brainstem spinal 
paths in monkey were interrupted. With corticospinal interruption the animal 
can no longer inhibit unwanted components of a coordinative structure such as 
the group of muscle contractions that extend joints of the same limb. Thus, for 
example, when presented with food that he must stretch for the monkey reaches 
out with extended limb but cannot then close his fingers to grasp the food. If 
with the joints flexed the animal grasps food placed close to him and raises it to 
his mouth he cannot then let go. In contrast, brain-stem spinal interruption 
appears to impede the animal’s ability to restrict the evoked coordinative 
structures to those relevant to the task. When extending an arm to reach for 
food at a distance the group of contractions that rotate the limb, or those that 
raise the limb, may come into play in addition to the task-relevant group of limb 
extensors. 
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In short we see that the delimiting of coordinative structures and the manner 
of their decomposition are effected in the segmental apparatus by instructions 
from separate mechanisms. But now we must pass from this gross differentiation 
of the action plan at the segmental level to the finer differentiation afforded by 
the fundamental servomechanism (or, more aptly, the fundamental coordinative 
structure). 

The main body of a muscle consists of extrafusal fibers that on contraction 
alter the relative positions of the bones to which they are attached. The 
innervation of extrafusal fibers is supplied by alphamotorneurons. Within the 
main body of the muscle are intrafusal fibers which are wrapped around the 
middle by the terminals of sensory fibers. These sensory fibers and the intrafusal 
muscle fibers to which they attach are referred to collectively as a muscle 
spindle. Muscle spindles connect to the extrafusal fibers at one end and to a 
tendon at the other and are therefore “in parallel” with the extrafusal fibers. 
Two functionally distinct spindle components can be identified: a static com- 
ponent that is sensitive to the instantaneous muscle length and a dynamic 
component that is sensitive to the rate of change of muscle length (Matthews, 
1964). On contraction of the intrafusal fibers, the spindle receptors register the 
difference in length and the difference in rate of change of length between the 
intrafusal and extrafusal fibers. The induced receptor excitation is com- 
municated to the linked alphamotorneurons which respond by recruiting more 
extrafusal fibers until the discrepencies in length and velocity have been an- 
nulled, Thus, in a situation in which a load is applied to a muscle extending it 
beyond its resting length, the spindle feedback provides an autonomous means 
of tailoring the muscle response to the new conditions. This negative feedback 
system identifies the fundamental servomechanism; it is now incumbent upon us 
to show that this servomechanism is biasable in ways of considerable importance 
to the theory of action. 

Intrafusal fibers, like extrafusal fibers, have a source of innervation, the 
gammamotorneurons. These motorneurons fall into two relatively independent 
classes, the gammastatic and the gammadynamic, regulating respectively the 
static and dynamic components of muscle spindles (Matthews, 1964). Again, 
gammamotorneurons, like alphamotorneurons, are under high-level control but 
the motor nerves that project from brain to gammamotorneurons and those that 
project from brain to alphamotorneurons are largely separate and thus it is 
optional whether the spindles and the main body of a muscle contract and relax 
together. “The spindles could therefore be activated while the main muscle 
remained passive, and vice versa” (Merton, 1973, p. 37). 

We see therefore that the gamma system allows for the modulation of the 
fundamental servomechanism, The gamma-static motorneurons can control the 
equilibrium state of the servomechanisms while the gamma-dynamic motor- 
neurons can control the “damping” of the servomechanism, that is, the rate at 
which it achieves equilibrium. Thus, the servomechanism is not only informed of 
what it has done, but more importantly it can be informed of what it must do. 
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This completes the elementary description of the biasable nature of the 
fundamental servomechanism; but one or two points remain to be considered. 

In addition to the biasable feedback loop signaling length and velocity through 
spindle receptors, there is another which signals muscular force through tendon 
organs. The signals conveying force feedback coverge on interneurons on their 
way to alphamotorneurons. As before, interneurons can be manipulated by 
higher-level instructions so that the inhibitory effects of force feedback on 
alphamotorneuron activity can be modulated. The biasable feedback loops 
conveying length, velocity, and force are inextricably linked in the regulation of 
the servomechanism. So we may expand on our comments above. While higher 
level control signals to alphamotorneurons set the servomechanism going, the 
higher level control signals to the gamma system and to the interneurons 
transmitting force information from the tendon organs function less as instiga- 
tors of movement than as modulators of the gain of the feedback loops, that is 
to say, they serve to adjust the ratios of the outputs of feedback loops to their 
inputs. This principle of higher-level modulation of spinal reflexes is generalized 
to the segmental mechanism of reciprocal inhibition by which we understand 
that spindle activity not only impinges upon an alphamotorneuron of its own 
muscle but also via inhibitory motorneurons upon an alphamotorneuron of the 
antagonist muscle. Spindle output contributes both to agonist contraction and 
to antagonist relaxation in the regulation of pairs of muscles controlling a joint. 
Clearly, from all that has been said the reflex interplay between agonist and 
antagonist is biasable. 

We are now in a position to identify the property of the spinal cord that is 
central to our current concerns (and for which we will shortly provide evidence), 
namely, that the system of segmental interactions is biasable and as a general 
strategy the activating of coordinative structures occurs against a background of 
spinal mechanisms already prejudiced toward executive intentions. Thus, it may 
be argued that the control of movement is in many respects the reorganization 
or tuning of the system of segmental interactions and that this attunement 
precedes the transmission of activating instructions to coordinative structures 
(Gelfand et al., 1971; Gurfinkel et al., 1971). 

Before we pass from this elementary discussion of tuning to take up the topic 
in earnest let us glance at two examples of how alpha-gamma linkage might 
embellish a relatively simple instruction. For the first, consider the previously 
discussed action of stepping, recognizing the fact that a double joint flexor of 
the hip also produces extension at the knee. The high-level representation of 
stepping can be said to specify crudely a general “flexor plan” for the limb 
(Lundberg, 1969). The knee flexors innervated on this plan are strong enough in 
the early phases of the movement to prevent extension but as flexion proceeds 
the double-joint muscle will become stretched inducing an intense discharge of 
spindle activity. The spindle feedback will impinge upon hip-flexor alphamotor- 
neurons and also produce reciprocal inhibition of the motor units belonging to 
knee flexors. Thus during the swing the knee flexors originally innervated on the 
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flexor plan will suffer inhibition from the spindle activity of the double-joint 
muscle. The upshot of this interplay is the differentiation of the broadly stated 
flexor plan into coordinate stepping (Lundberg, 1969). 

For the second example, consider a sudden change in the loading on an 
outstretched arm: a heavy object is placed into the hand, in a task where 
the arm’s inclination to the ground is to be kept constant. We may suppose that 
instructions to the appropriate coordinative structures quickly bring the arm 
back into a close approximation to the desired position, with spindle feedback 
coming into play to finely tune the terminal point of the trajectory (cf. Arbib, 
1972; Navas & Stark, 1968) and to maintain the arm in its desired position 
under the new conditions. In this sense we construe the alpha and gamma 
systems as participating in a “mixed ballistic-tracking strategy” (Arbib, 1972, p. 
134) with the alpha system determining the ballistic component which gets the 
limb quickly into the right ballpark and the gamma system determining the 
superimposed tracking component which supplies the needed refinements. 

From what has been said about the segmental apparatus and its pretuning we 
understand that the alpha-gamma processes of the two examples cited above do 
not take place in a vacuum. Rather, they occur against a backdrop suitably 
colored by supraspinal influences. It can be demonstrated that in the final 30 
msec or so preceding a movement there is a pronounced enhancement of the 
effect of reciprocal inhibitation on the future antagonist of that movement 
(Kots & Zhukov, 1971, see below). What this suggests is that a motion like 
stepping or raising the arm is anticipated through the supraspinal tuning of the 
segmental mechanisms of reciprocal inhibition. We should also recognize that the 
backdrop for a voluntary act is not limited to adjustments in spinal reflexes. 
Thus, for example, when an arm is raised voluntarily by a person standing 
upright it is possible to observe in the period immediately prior to the first signs 
of arm muscle activity, anticipatory activity in a number of muscles of the lower 
limb and trunk (Belen’kii, Gurfinkel, & Pal’tsev, 1967). Figuratively speaking, 
when one moves an arm in a standing position one first performs “movements” 
with the legs and the trunk and only then with the arm. 

In summary, we have considered in preliminary but sufficient fashion the 
kinds of mechanisms that effect significant variation in the behavior of low-level 
holons without infringing upon their autonomy. These mechanisms suitably 
controlled from higher domains allow for the precise regulation of muscular 


contraction. The gist of the whole matter is given in a short paragraph by 
Pribram (1971): 


When reflexes become integrated by central nervous system activity into more complex 
movements, integration cannot be effected by sending patterns of signals directly and 
exclusively to contractile muscles, playing on them as if they were a keyboard. Such 
signals would only disrupt the servoprocess. In order to prevent disruption, patterns of 
signals must be transmitted to the muscle receptors, either exclusively or in concert with 
those reaching muscle fibers directly. Integrated movement is thus largely dependent on 
changing the bias, the setting of muscle receptors [p. 225]. 
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To this we need only add that the modulation of interneuronal pools must also 
play a significant part in the biasing of servomechanisms. 


MOVEMENT-RELATED SEGMENTAL PRETUNING 


Our task now is to view evidence for segmental pretuning in voluntary move- 
ments. But before we do so we must acknowledge that the available evidence 
comes from experiments in which the performers on cue are required to flex a 
knee, bend an elbow, extend a foot, and in general to execute movements whose 
trajectories, velocities, degrees of displacement, etc., are indifferent to the 
environment. To my knowledge there are no experiments on segmental pretun- 
ing for acts that depend on the detection of environmental properties for their 
performance. As a precautionary measure, therefore, we will distinguish between 
movement-related and environment-related pretuning. The former will refer to 
the changes in the segmental apparatus preceding the execution of a simple 
voluntary motion that is unrelated to the environment, in the sense that neither 
the actor’s position with respect to the environment nor the position or orienta- 
tion of objects with respect to each other and to the actor are altered by the 
motion. The latter, environment-related pretuning, will refer to segmental 
changes that precede actions or, more precisely, components of-actions which 
are environmentally projected, in that their purpose is to displace the actor with 
respect to the environment or to displace (or rotate, or reflect) objects with 
respect to the actor, or both. 

The goal we are approaching slowly is that of understanding, in a rough and 
approximate way, how seeing enters into doing. To this end we will need to 
extrapolate from movement-related pretuning to a general picture of how 
environmental properties control action. At all events, our immediate concern is 
with movement-related pretuning and we begin with some general comments. 

The methodology for investigating movement-related pretuning has much in 
common with the methodology that characterizes the information-processing 
approach to visual perception (cf. Haber, 1969) which seeks to determine how 
visual “information” is modified in the course of its flow in the nervous system. 
Thus, techniques of masking, delayed partial-sampling, and reaction time are 
used to assess the correlation between stimulus and response at varying delays 
after visual stimulation. 

In a similar if less sophisticated vein the tuning experiments that we will 
consider in this section judiciously apply the principle of probing the nervous 
system, in particular the spinal cord, in the interval elapsing between a warning 
signal and a cue to respond, or (more especially) between a cue to respond and 
the first signs of activity in the agonists executing the motion. The probes are 
simple reflexes elicited during the interval, with the latency and amplitude of the 
reflexes (recorded by the electrical response of the corresponding muscles) taken 
as indicants of the state of the segmental apparatus prior to the movement. The 
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reflexes used for this purpose have generally been tendon reflexes elicited by a 
tap and the Hoffman or H reflex, which is a monosynaptic reflex in the 
gastrocnemius-soleus muscle group and elicited by electrical stimulation of the 
tibial nerve in the popliteal fossa (Hoffman, 1922). 

As an introduction to the procedure and to the observations that will be of 
interest to us we consider two exemplary experiments. In the first (Gurfinkel er 
al., 1971) the participant is seated with legs flexed and on command extends one 
leg, the responding leg remaining constant throughout the experiment. Surface 
electromyographic (EMG) recording of activity in the quadriceps femoris muscle 
reveals that the tibia extension occurs at a latency of 160-180 msec. If the 
patellar reflex is evoked in the same leg within 100 msec or so of the cue to 
respond the amplitude of the reflex is unaffected (compared to the control 
condition in which no command to extend is given). However, if the patellar 
reflex is elicited beyond this period then its amplitude is enhanced the closer to 
the command that it is elicited. We infer, therefore, that the state of the 
segmental apparatus has been altered prior to activation of the muscle group 
extending the knee. In the second example (Gottlieb, Agarwal, & Stark, 1970), 
the participant is again seated normally but with the right leg extended, knee 
slightly flexed, and the foot firmly strapped to a plate to which he transmits an 
isometric force through either plantar flexion or dorsiflexion. The task of the 
participant is to match the level of his foot torque with a target level specified 
by the experimenter in what is, essentially, a continuous tracking task in which 
the varying target level of torque and the level of the participant’s matching 
torque are displayed on a scope viewed by the participant. The H reflex is 
elicited at different delays subsequent to the target adopting a new level, and 
both the H reflex and the activity in the gastrocnemius-soleus muscle group 
(agonist in plantar flexion) and the anterior tibial muscle (agonist in dorsi- 
flexion) are measured. Summarized briefly, the results are that the amplitude of 
the H reflex is distinctly augmented if elicited in the period 60 msec prior to the 
initial signs of voluntary motor unit activation in plantar flexion and is generally 
inhibited in approximately the same interval before the first signs of voluntary 
dorsiflexion. Again we infer that there are changes in the spinal cord that precede 
agonist activation. 

How specific is segmental pretuning? Consider initially some further experi- 
ments reported by Gurfinkel et al. (1971). In one, two movements are executed 
by the subject on separate occasions: flexing the leg at the hip joint and 
extending the knee. Measures are taken of the tendon reflex of the rectus head 
of the quadriceps femoris muscle, which spans two joints, and of the lateral head 
of the same muscle, which spans only one joint. When the hip is flexed a 
premovement increase is observed only in the amplitude of the tendon reflex 
elicited by stimulation of the rectus head. But it is important to note here that 
while the rectus head of the quadriceps femoris is involved in hip flexion the 
lateral head is not. By contrast, when extension of the knee is called for, a 
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movement that involves both heads of the quadriceps femoris, both reflexes are 
significantly increased in amplitude prior to signs of voluntary motor neuron 
activity. In another experiment Gurfinkel et al. (1971) observed that in the 
70-80 msec prior to flexing the leg at the knee the patellar reflex is amplified, 
but they also observed that the patellar reflex is amplified (although not to the 
same degree) prior to flexing the elbow of the arm ipsilateral to the leg in which 
the patellar reflex is elicited. We may conclude therefore that both specific and 
nonspecific changes in the segmental apparatus of the spinal cord precede 
voluntary motor unit activation. 

This conclusion is buttressed by some other experiments that examine the 
changes in the spinal cord during the period intervening between a warning signal 
and the signal to execute a given movement. We take, as examples, experiments 
reported by Requin and Paillard (1971) and by Requin, Bonnet, and Granjon, 
(1968). In these experiments the movement is extension of the foot (plantar 
flexion), and both tendon and H reflexes are recorded from both the participat- 
ing leg and the nonparticipating leg. Following the warning signal there is 
evidence of an increase in the amplitude of the reflexes measured in both legs, an 
increase which persists for the reflexes of the nonparticipating leg but is 
progressively depressed in the participating leg with greater proximity to the cue 
to respond. In short, these experiments provide evidence that after a warning 
signal and before a signal to execute a movement there occur both a nonspecific 
change in spinal sensitivity and a specific change related to the motorneuron 
pool—the servomechanisms—about to be involved in the forthcoming movement. 
In the view of Requin and his colleagues the depression of the reflex amplitude 
in the participating leg is due, under the conditions of the warning signal 
procedure, to central (supraspinal) influences that selectively “protect” the 
direct participants in the movement from irrelevant influences exerted upon 
them prior to their activation. 

Evidently in the premovement period specific changes occur in the feedback 
loops related directly to agonist regulation. But can we demonstrate similar 
effects in the more extended feedback loop relating the state of an agonist to its 
antagonist? Two experiments by Kots (Kots, 1969; Kots & Zhukov, 1971) 
provide an answer. 

Kots (1969a) wanted to know whether the enhancement in the H-reflex 
excitability of the motor neurons of the gastrocnemius evidenced in the latent 
period of a voluntary movement depended on the role the gastrocnemius was to 
play in the forthcoming movement. To this end the H-reflex amplitude was 
measured in the gastrocnemius when it was the future agonist of the movement, 
that is, in plantar flexion, and when it was the future antagonist of the 
movement, that is, in dorsiflexion. It was observed that following a command to 
move, the amplitude of the H reflex was significantly enhanced in the period 
beginning 60 msec prior to the first signs of voluntary motor unit activation only 
when the gastrocnemius was future agonist. When the antagonist role was 
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assumed, the H reflex was neither enhanced nor depressed in the latent period 
but was found to decline sharply immediately following the first myogram signs 
of motor unit activity in the agonist muscle, the anterior tibial. 

It would appear, therefore, that the effect of tuning was specific to agonistic 
activity and that the failure to detect a depression in the H reflex when the 
gastrocnemius was the antagonist suggests that the “positive” priming of agonist 
centers is not paralleled by a “negative” priming of antagonist centers. The 
absence of antagonist depression in Kots’ (1969a) experiment contrasts with the 
evidence of such depression in the experiment of Gottlieb er al. (1970), de- 
seribed above. While not stating it explicitly, the report of the latter experiment 
implies that plantar flexion and dorsiflexion were mixed randomly in the course 
of an experimental session; in Kots’ experiment the two response modes were 
examined separately and this difference in procedure may account for the 
difference in results. At all events, taken together, the two investigations suggest 
that depressing the motor neurons of antagonistic muscles in the latent period of 
voluntary movement is not a necessary concommitant of tuning agonists. How- 
ever, as we shall see in the second experiment of Kots (Kots & Zhukov, 1971), 
there is indeed an adjustment made in the inhibitory influences on the antago- 
nists of a movement during the latent period that is not manifest as a depression 
in motor neuron excitability. 

The Kots and Zhukov (1971) experiment made use of paired stimulation, a 
procedure that is comparable to the forward masking procedure commonly used 
in visual information-processing experiments (e.g., Turvey, 1973); essentially, a 
leading stimulus is used to impede the response to a lagging stimulus. For Kots 
and Zhukov the leading member of the stimulus pair was electrical stimulation 
of the peroneal nerve and the lagging member was electrical stimulation of the 
tibial nerve. Peroneal nerve stimulation elicits a direct response (the M response) 
in the motor neurons of the anterior tibial muscle without accompanying M and 
H responses in the gastrocnemius—soleus muscle group. Tibial nerve stimulation, 
as we have already seen, elicits the monosynaptic H reflex of the gastrocnemius— 
soleus group. The H response in the gastrocnemius—soleus group is significantly 
depressed when elicited very shortly after peroneal stimulation, say 2-4 msec. 
The brief latency of this effect implies that it is realized by the “spinal apparatus 
of reciprocal inhibition” (Kots & Zhukov, 1971). We can therefore exploit this 
paired-stimulation procedure to monitor the state of reciprocal inhibition mech- 
anisms during the latent period of a voluntary movement. Thus, Kots and 
Zhukov sought to determine whether the impairment in the H reflex induced by 
prior peroneal stimulation was intensified in the latent period of voluntary 
dorsiflexion. In more general terms, they sought to determine whether there is 
pretuning of the mechanisms of reciprocal inhibition. During dorsiflexion, re- 
ciprocal inhibition would protect the anterior tibial muscle from the antagonistic 
response of the gastrocnemius-soleus muscles; Kots and Zhukov looked to see if 
this mechanism was primed for its task before voluntary activation of the 
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anterior tibial motor neurons. The experiment showed that in the final 30 msec 
prior to dorsiflexion the paired-stimulation effect was significantly enhanced 
and, moreover, that this enhancement could not be due to a reduction in 
excitability of the motor neurons of the gastrocnemius-soleus group, since the H 
response in the absence of preceding peroneal stimulation was unaltered during 
the latent period. Of course, this is what Kots (1969a) had found before. 

Collectively, the experiments we have discussed suggest that a profound 
reorganization of the spinal cord precedes movement. There are both nonspe- 
cific and specific components of this reorganization, and the latter have been 
shown to include the mechanisms of reciprocal inhibition in addition to the 
servomechanisms regulating agonist activity. Moreover, the reorganization of the 
interaction among neural mechanisms at the spinal level follows the pattern of 
initially diffuse becoming more localized the closer in time to the manifestation 
of the desired movement (Gelfand et al., 1971). It is evident, therefore, that in 
the differentiation of an action plan the realization of instructions to coordina- 
tive structures is determined by the state of these structures on receipt of the 
instructions (cf. Gurfinkel & Pal’tsev, 1965).6 Since the argument is that 
coordinative structures when activated perform in a relatively autonomous 
fashion, it follows that the details of their performance are very much deter- 
mined by the state of the segmental apparatus at the time of activation. 


TUNING AS PARAMETER SPECIFICATION 


The elegance of tuning as a control process is that it permits the regulation of an 
autonomous system without disrupting the system’s autonomy. So far we have 
illustrated this principle only in the control of servomechanisms at the level of 
muscle contractions. But this should not blind us to the likelihood of tuning as a 
general principle fundamental to all domains of the action heterarchy. We recall 
the comment that actions are produced by fitting together substructures, each of 


SWhile we have chosen to discuss only experiments using simple, single movements we 
should note that other experiments have examined patterns of segmental pretuning in the 
performance of sequential and rhythmic movements (e.g. Kots, 1969; Surguladze, 1972). 
Thus, Kots (1969b) showed that in the sequential performance of two movements opposite 
in direction in the ankle joint the segmental organization for the second movement is 
realized during the execution of the first. nel 

To demonstrate this point Gurfinkel and Pal’tsev (1965) examined the effect of eliciting 
a tendon reflex subsequent to a cue for voluntary movement (extension of the knee) on the 
latency of the voluntary movement. They found that the latent period of voluntary ex- 
tension was linearly dependent on the time at which the reflex was elicited. In addition 
they showed that this effect held even when the reflex was elicited in the leg contralateral to 
that executing the voluntary leg extension. It is assumed that the reflex induces a change in 
the segmental apparatus and that the realization of commands for leg extension is therefore 
dependent on the prevailing state of the system of spinal relations. 
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which deals relatively autonomously with a limited aspect of the action problem. 
In addition, we recall that each domain may be construed as a representation in 
which relations are defined on a set of autonomous structures with the size of 
these structures becoming progressively smaller and their number progressively 
larger as the action plan is mapped into progressively less abstract representa- 
tions. We now hypothesize that into each representation tuning functions may 
enter as modulators of coordinative structures. 

Miller, Galanter, and Pribram (1960), discussing the acquisition of typing, 
suggest that the student typist learns to put feedback loops around larger and 
larger segments of her behavior. We might well suppose that this notion applies 
beyond typing to skilled acts in general, and with internal feedback loops in 
addition to the more commonly understood forms of feedback. We can imagine 
tuning functions of a more abstract kind related to the modulation of feedback 
between action segments that collectively behave as relatively autonomous units 
in the performance of any given skilled behavior. Again tuning would permit 
appropriate variation without disrupting, in these instances, the acquired self- 
regulating procedures. 

To extend our usage of tuning, we will adopt a most important and proyoca- 
tive hypothesis, namely, that tunings parameterize the equivalence classes of 
functions specified by executive procedures. This follows from Greene’s (197 1a, 
b) contention that the smallest units of information available to an executive are 
probably not functions but families of functions parameterized by possible 
tunings. 

Although we are here attempting to pass beyond the idea of tuning limited to 
the fundamental servomechanism we may profitably exploit our earlier discus- 
sion of that mechanism to illustrate the notion of tuning as parameter specifica- 
tion. We take as our departure point the experimental and mathematical analysis 
supplied by Asatryan and Fel’dman (1965) and Fel’dman (1966a, b) of the 
maintenance of joint posture and of the simple voluntary movement needed to 
achieve a desired angle of joint articulation. Consider a simple mass-spring 
system defined by the equation F = So(l — Xo) where F is the force, So is the 
stiffness of the spring, / is the length of the spring and ào is the steady-state 
length of the spring, that is, the length at which the force developed by the 
spring is zero. This simple mass-spring system is controllable to the degree that 
the parameters Sp and X, are adjustable. Changing A, with Sp constant generates 
a set of nonintersecting characteristic functions, F() = -So (/— X), and changing 
both parameters generates a set of functions F(/) = —S(/ — A) that will pass 
through all points in the plane defined by the cartesian product F X I. 

Let us now suppose that a joint-muscle system is analogous to our simple 
mass-spring system, in which case we can argue that the problem of controlling a 
joint-muscle system reduces to that of fixing certain characteristics of the 
system, that is, of setting mechanical parameters of the muscles, or more 
precisely, of setting biases on the fundamental servomechanisms. In the analogy 
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the characteristic functions of a joint-muscle system are of the form M($) where 
M is the total muscular moment and ¢ is the joint angle. And each M(b), 
therefore, is determined by the mechanical parameters of the muscles regulating 
the joint: 


A= (Ay, Ars. o An), S = (S1, S2,- - Sn), 


where n is the number of muscles. 

Given the foregoing comments let us now consider experiments using the 
technique of partial unloading—a technique (which we will shortly describe) that 
would assay the characteristic properties of an ordinary spring. These experi- 
ments (Asatryan & Feldman, 1965) sought to demonstrate that for a given 
situation the variations of muscular moment as a function of joint angle (or vice 
versa) are defined by an initial setting of the parameters, that is, by a character- 
istic function. For purposes of analysis we will refer to the state of the 
joint-muscle system as a, where a is defined by the vector (M, $). When M and ® 
are constant for some period of time then a is a steady state of the system. 

The experimental methodology may be described briefly. The participant’s 
forearm is fixed on a horizontal platform whose axis of rotation coincides with 
the axis of flexion and extension of the forearm. The horizontal platform is 
attached to a simple pulley system supporting a set of weights which can be 
selectively unloaded. At the outset of a trial the participant establishes a steady 
state, a,, of the joint-muscle system: given a specified angle of articulation the 
participant must establish a muscular moment to compensate for the effect of 
the moment of external forces—determined by the weights and their direction of 
pull on the horizontal platform—opposing flexion (or extension) of the joint. 
Thus for a standard initial opposing force, different steady states, a,, can be 
established for different joint angles, és- Once the steady state is established at a 
given d, the participant is then asked to close his eyes and the weight is 
unloaded, with the amount of unloading varying across trials. The new angle of 
articulation—the new steady-state aş —to which the arm briefly moves following 
unloading (and before the participant can make compensatory adjustments) is 
recorded. From a series of experiments such as the one we have described 
Asatryan and Fel’dman (1965) demonstrated that for all possible initial states as 
of the joint-muscle system a set of nonintersecting functions, Ms($) (or bM) ), 
are generated relating muscular moments to the new steady-state angles of the 
joint. Moreover they showed that the form of the function M,(¢) does not 
depend on the external moments but is determined unambiguously by the 
parameters of the initial state of the system. (This was demonstrated by using a 
set of external moments that were rising functions of joint angles, and a set that 
were diminishing functions of joint angles.) So we conclude that the function 
M,(¢) for each a, is an invariant characteristic of the joint-muscle system: If the 
system is perturbed it will follow a trajectory of states leading to a new state of 
equilibrium, where both the trajectory and the equilibrium state are defined by 
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the parameters fixed in the initial steady-state ay. And since the curves are 
nonintersecting, the transition from one M($) to another requires changing X; 
with little, but preferably no change, in S; (Asatryan & Fel’dman, 1965). It 
would seem, therefore, that a joint—muscle system does behave like a spring, that 
is, like a vibratory system, and that the action structures can choose parameters 
for this “spring” in accordance with the prevailing conditions. For a brief period 
of time following perturbation, until new parameters of the spring can be 
specified, the joint-muscle system behaves in the way that we would expect the 
chosen “spring” to behave. 

We now proceed to develop this theme through the experiments of Fel’dman 
(1966b). These were conducted with a slight variation on the apparatus de- 
scribed above. The pulley-weight system was replaced by a detachable spring 
that opposes flexion of the joint but is insufficiently taut to prevent flexion. At 
rest the joint is flexed at an angle do, and on the occurrence of an auditory cue 
the participant must establish as rapidly as possible and without the aid of vision 
the steady angle ¢,. (The participant is given a practice session so that he can 
achieve $, with a minimum of error). During a series of trials the spring is 
occasionally detached within the period subsequent to the auditory cue and 
prior to movement. Now suppose that at the outset of a trial a fixed invariant 
characteristic M($) has been determined for the attainment of a steady- 
state a, corresponding to the desired angle ¢,. In the steady-state a,, ¢ = 
6, and M = Me, where Me is the moment of force provided by the spring 
attached to the platform. But when the spring is detached, a new steady-state 
aı' = (0, 1) is required to achieve the same angle of articulation ¢,, which 
means, of course, that a new invariant characteristic M’(¢) is needed. The 
question is: Can the transition from one invariant characteristic to another be 
effected during the execution of the movement? If it cannot, then when the 
spring is detached the joint will move to the angle œ) determined by the 
characteristic function M($). In the space (M, $), #2 will be at the intersection of 
M(¢) with M =0 (since Me = 0). The results of the experiment reveal that during 
the rapid establishment of a desired steady angle in the joint, correction of the 
invariant characteristics of the joint-muscle system (correction of the param- 
eters defining the projected steady state) does not occur. The correction is made 
only after the achievement of the new steady state (corresponding to #2) when 
the error becomes obvious. 

Now we wish to prove that the joint-muscle system truly behaves in this 
situation like a mass-spring system; although the movement of such a system as a 
whole is determined by the initial conditions, the equilibrium position does not 
depend on them and is determined only by the parameters of the spring and the 
size of the load. Using the paradigm described above we attach a pulley-weight 
system opposing extension such that release of the weight induces passive 
extension in the joint. Thus there are two external moments operating on the 
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limb: a spring-opposing flexion and a weight-opposing extension. The participant 
becomes acquainted with the situation in which the spring is detached in the 
latent period before movement to the intended angle ¢ı , bringing about passive 
flexion of the joint. But on some occasions the weight is also detached, leading 
additionally to passive extension before the voluntary movement. The results 
show that these rather radical and unpredictable changes in the initial conditions 
do not alter the behavior of the joint-muscle system: The trajectory of the 
system is still determined by the initial setting of parameters, that is, it moves to 
the state defined by the characteristic function M($) established at the outset of 
the trial. In brief, the equilibrium position is independent of the initial condi- 
tions (Fel’dman, 1966b). 

In further analyses, this time of rhythmic movements of the joints, Fel’dman 
(1966b) was able to demonstrate that there is an independent parameter setting 
for the dynamics of the joint-muscle system. We may, therefore, envisage the set 
of fundamental servomechanisms (the alpha-gamma links together with the 
tendon feedback loops) regulating joint flexion and extension as collected 
together into a single vibratory system for which “static” and “dynamic” 
parameters can be specified. Choice of static parameters for the system deter- 
mines the aim of a movement (the final steady state) independently of initial 
conditions; choice of the dynamic parameters determines (to a large extent) the 
rate and acceleration of the movement and also its form (aperiodic, oscillatory, 
etc.) (Fel’dman, 1966b). 

This analogy between systems controlling action and vibratory systems sug- 
gests that we may usefully conceive of coordinative structures in general as 
biasable, self-regulating vibratory systems. In their simplest forms such systems 
might be modeled by the following second-order homogeneous linear differential 
equation with constant coefficients: 


mX" (£) + kX'(0) + sX (£) = 0, 


where X(f) is the function relating the displacement of the system from a steady 
state to time.” In such a system the setting of the parameter s defines the 
“stiffness” of the system and thus its equilibrium state, and the setting of the 
parameter k defines the friction or damping constant which determines the rate 
at which the system achieves equilibrium and the form of its behavior, that is, 
whether it oscillates or not. By way of summary, we have seen that the 
functional tuning of the segmental apparatus of the spinal cord may be likened 
to the specification of the parameters $ and k for vibratory systems. On the 


7 This simple linear differential equation is given only to illustrate a principle. It is not 
meant to model (although it might) an actual coordinative structure. If we were to make the 
illustration more realistic and more general we would need to consider forced vibration in 
addition to free vibration, and to concern ourselves with equations in which the applied 
force varied with time or acted in an arbitrarily short interval. 
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assumption that all coordinative structures behave as vibratory systems, then 
tuning as parameter specification emerges as a viable procedure for adjusting the 
behavior of selected coordinative structures at all levels of abstraction of the 
action heterarchy. Thus, while some coordinative structures coordinate auton- 
omously a greater number of pieces of the action apparatus than other coordina- 
tive structures (compare, for example, two classes of basic coordinative struc- 
tures, the long spinal reflexes and the flexion reflexes), the manner of their 
attunement is fundamentally the same. 

We now address the important question of whether the tuning and activation 
of autonomous systems are governed by the same mechanisms. Again, we will 
proceed on the assumption that the regulatory principles for large systems 
follow very much the pattern of small systems. This permits us the latitude of 
extrapolating from the tuning of small systems, that is, the fundamental servo- 
mechanism, about which we know something, to large systems, about which we 
know very little. The evidence of segmental pretuning suggests, among other 
things, that the nervous system has available a means of selectively raising and 
lowering the gain of spindle and tendon organ feedback loops. Indeed, the 
comment was made earlier that the control of the alpha and gamma systems is 
largely separate so that it is optional whether the two systems be concurrently 
active. But in the experiments we have taken as evidence for segmental pretun- 
ing, can a case be made for the selective modulation of servoprocesses indepen- 
dent of instructions sent specifically to activate alphamotorneurons, either 
directly or indirectly through gammamotorneurons? In experiments exploiting 
the H reflex and plantar flexion, such as those of Gottlieb et al. (1970), we 
might suppose that changes in the reflex during the latent period reflect nothing 
more than the increasing excitability of gastrocnemius-soleus motor units 
brought about by direct supraspinal signals to the alphamotorneurons. Or, in a 
similar vein, the increase in the H reflex represents the increased excitability in 
the alphamotorneuron pool of the gastrocnemius-soleus group in response to 
stimulation from the gamma system, which is in turn responding to directions 
from above. In these accounts the variation in the reflex is not an independent 
event but an epiphenomenon of alpha system innervation; that is to say, the 
voluntary EMG and the H-wave variations are manifestations of the same 
controlling input. Against this argument, however, Gottlieb er al. (1970) point 
out that changes in the wave form and amplitude of the H reflex are not 
correlated with changes in the agonist or antagonist EMG and, in addition, that 
the time courses of the recordings are clearly different. From their point of view 
it is much simpler to propose that for their particular form of voluntary 
movement there is a means for modulating the H reflex (and by inference, the 
fundamental servomechanism) that is separate from the means for activating 
alphamotorneurons, In more general terms we may conjecture that the mecha- 
nisms of tuning and activating coordinative structures are largely separate. 
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THE RELATIONSHIP BETWEEN THE EXECUTIVE AND TUNING 


Let us summarize briefly our thinking thus far. The executively specified action 
plan identifies the relevant subset of coordinative structures and a set of 
functions on that subset (identifying the necessary equivalence classes) that will 
modulate its elements and relate them in a certain fashion. In the course of 
spelling out the action plan through successive procedures within the action 
heterarchy the functions identified by the executive may be substituted for by 
functions more suited to the current low-level conditions of the system. The 
interconverting of functions, however, leaves the equivalence classes invariant. 
Of these interconversions and of the low-level realization of the details of the 
action plan, the executive remains virtually ignorant. 

The eventual activation of coordinative structures takes place against a back- 
ground of prearranged interactions within the segmental apparatus of the spinal 
cord. We say that the segmental apparatus has been pretuned, or simply, tuned, 
and that the detailed performance of coordinative structures is determined by 
the extent interactive state of the segmental apparatus. The tuning of coordina- 
tive structures and the activation of coordinative structures appear to be gov- 
erned by separate mechanisms. 

We now ask: If it is the case that the activation and tuning of coordinative 
structures are separately controlled events, at what level is the separation first 
evident? More precisely, we are keenly interested in the issue of whether tuning 
is the responsibility of the executive, and thus part of the initial representation 
of the act, or whether this responsibility lies outside the executive’s domain. 

For a given movement such as plantar flexion we may suppose that the 
executive specifies a tuning function to the servomechanisms for the (possibly) 
separate alpha and gamma instructions to follow. The independence of move- 
ment-related tuning would arise, on this account, because the tuning function is 
effected by substructures different from those responsible for motor neuron 
activation, much along the lines that the delimiting of coordinative structures 
and their decomposition are controlled separately. In this view the family of 
possible tunings defines just another equivalence class, another invariant unit of 
information for the executive specification of solutions to action problems. 

Alternatively, we may propose that segmental tuning is not specified in the 
action plan but is determined by other structures on acknowledgment of the 
executive’s intention (cf. Greene 1971a, b). There would be special advantages 
accruing to a devolution of responsibility for specifying action plans and 
segmental tuning, advantages that would be especially pronounced when actions 
are related to environmental events. For example, it would mean that the 
executive could develop a repertoire of plans appropriate to frequently occurring 
classes of environmental events, so that when confronted with an event of a 
certain class the executive issues a standard set of instructions and leaves to 
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relatively independent tuning systems the responsibility for achieving the appro- 
priate variant. Indeed, the largely invariant species-specific behavior of animals 
documented in the now celebrated works of ethologists (e.g., Tinbergen, 1951) 
strongly suggests that evolution has thoroughly exploited the principle of sepa- 
rating action-plan specification from tuning. The instinctive rituals are released 
by stimulation of a simple kind—the red belly of the stickleback, the spot under 
the herring gull’s beak—but the unfolding stereotypic behavior is flexible: it 
relates to the lie of the land, to the contingencies of the local environment. We 
should suppose that these species-specific action plans are adjusted by the 
pickup of information about the environment, that is to say, their tuning is 
environment related. 


Visual Control of Locomotion 


Locomotion provides an instructive example of this viewpoint, for although 
locomotion propels an animal through its cluttered, textured environment, the 
basic locomotion pattern generator is independent of local conditions (Evarts et 
al., 1971). The necessary adaptive modifications are effected by feedback from 
the peripheral motor apparatus (the muscles and the joints), from changes in 
tactual motion, from the basic orienting system (Gibson, 1966b), and most 
significantly, from the perceptual pickup of information about surfaces and 
objects, about the relations among them and the moving animal. Visually 
detected information about the environment plays a fundamental role in permit- 
ting anticipatory changes in the basic locomotion pattern through “feed for- 
ward”; appropriate changes in coordination may be induced before the animal 
confronts a certain kind of surface irregularity or a certain kind of object. To 
manipulate the locomotion plan by touch or kinesthetic feedback alone would 
be unsatisfactory, since this form of regulation would often occur after an 
ill-adjusted movement and thus would specify compensatory changes for states 
that are no longer current. It is far better to have the low-level realization of the 
plan adjusted beforehand through patterns of feed forward related to properties 
of the optic array and to leave to touch and joint-muscle feedback the task of 
achieving small, final adjustments. At all events, the locomotion illustration 
raises the important issue in the present context of how the visual detection of 
environmental properties relating to the modification and control of locomotion 
is realized in the language of the action system. 

With this issue in mind let us proceed to examine in some detail the problem 
of how an animal moves about in a stable environment. We take as our 
orientation J. J. Gibson’s (1958) analysis of locomotion and its control by 
vision. First, we recognize, following Gibson, two fundamental assertions: (1) 
the control of locomotion relative to the total environment is governed by 
transformations of the total optic array to a moving point; (2) the control of 
locomotion relative to an object in the environment is governed by transforma- 
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tions of a smaller bounded cone of the optic array—a closed contour with 
internal texture in the animal’s visual field. Second, and again respecting Gibson, 
we recognize the following as aspects of locomotion requiring our attention: (1) 
beginning locomotion in a forward direction; (2) terminating locomotion; (3) 
locomoting in reverse; (4) steering toward a specific location or object; (5) 
approaching without collision; (6) avoiding obstacles; (7) pursuing and avoiding 
a moving object. Additionally, we recognize that locomotion must be adjusted 
to the physical properties of the surface—its convexities and concavities, its 
slants and slopes, its edges. 

For each of the aspects of locomotion we can identify correspondences in the 
flow patterns of the optic array. Thus, to initiate locomotion in a forward 
direction is to activate and relate the coordinative structures that comprise the 
locomotor synergism (Gelfand et al., 1971) in such a fashion as to make the 
forward optic array flow outward; to cease locomotion is to terminate the optic 
flow; and to locomote in reverse is to pattern the locomotor synergism in a 
manner that makes the optic array flow inward. To move faster or slower is to 
make the rate of flow increase and decrease respectively. As Gibson (1958) 
remarks: “An animal who is behaving in these ways is optically stimulated in the 
corresponding ways, or, equally, an animal who so acts to obtain these kinds of 
optical stimulation is behaving in the corresponding ways [p. 187] .” Now during 
forward movement the center of the flow pattern is the direction in which the 
animal is moving, that is to say, the part of the array from which the optic flow 
pattern radiates corresponds to that part of the solid environment to which the 
animal is locomoting. If the animal changes direction then naturally the center 
of flow shifts across the array. Thus we can say that to maintain locomotion in 
the direction of an object is to keep the center of flow of the optic array as near 
as possible to that part of the structure of the optic array which the object 
projects. 

In moving about a stable environment an animal will approach solid surfaces 
that it will need to contact or avoid as situation and history demand. Objects are 
specified in the optic array by contours with internal texture. Areas between 
objects are specified either by untextured homogeneous regions (that is, sky) or 
by densely textured regions (that is, sand, grass). In approaching an object the 
closed contour in the array corresponding to the boundaries of the object 
expands with the rate of expansion for a uniform approach speed, accelerating in 
inverse proportion to the animal’s proximity to the object. If the animal is on a 
collision course with the object, then a symmetrically expanding radial flow field 
will be kinetically defined over the texture bounded by the object’s contours. 
On the other hand, if the expansion is skewed, that is, if the pattern of texture 
flow is asymmetrical then this specifies to the animal that it is ona noncollision 
course. A translation of the center of the flow pattern laterally to the animal’s 
right or to the animal’s left specifies that the animal will bypass the object on, 
respectively, its right or left. In Gibson’s (1958) account the guiding principle 
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for approaching an object without collision is to move so as to cancel the 
forward and relatively symmetrically expanding flow of the optic array corre- 
sponding to the object at the instant when “the contour of the object on the 
texture of the surface reaches that angular magnification at which contact is 
made [p. 188].”® And to avoid objects, to steer successfully around them, the 
animal needs to keep the center of the centrifugal flow of the optical array 
outside the contours with internal texture and inside the homogeneous or 
densely textured surface areas. 

Suppose now that the object to which movement is being directed is a moving 
object, as in the case of one animal pursuing another. We can again identify 
corresponding properties of the optic array. A prey fleeing a predator is specified 
by the fact that for the predator the overall optic array flows from a center, but 
a contour with internal texture within the overall flow pattern is not expanding; 
absolute expansion of the contour means that our predator is making good 
ground on its prey, contraction of the contour may mean that our prey will live 
to run another day. The principle of pursuit is summed up lightheartedly by 
Gibson (1958), “. . . the rule by which a big fish can catch a small fish is simple: 
maximize its optical size in the field of view [p. 188].” 

We see, in short, that controlling locomotion calls for the detection of change, 
detection of rates of change, and detection of rates of rates of change in the 
flowing optic array. It also calls for the detection of changes in parts of the 
Structure of the optic array with respect to the optic array as a whole. We 
assume that animals are sensitive to all of these properties of stimulation that 
vary over time and that they do indeed detect them (Gibson, 1966b; Ingle, 
1968). We should also note that modulating the optic array through movement 
and modulating movement through changes in the optic array go hand in hand; 
thus the cybernetical loop of afference, efference, reafference is virtually con- 
tinuous. 

But we must now face up to a point that has been neglected thus far. In 
directing its locomotion to one object and weaving its way among others, and in 
pursuing one moving object and fleeing another the animal exhibits its capacity 
to make discriminative responses. But these responses must be based upon 
different properties of stimulation from those that determine the control of 
locomotion: they are responses that are specific to those properties of the optic 
array which do not change as opposed to those which do; importantly, they are 
properties of stimulation that do not result from the animal’s locomotion. The 
animal must be able to detect permanent properties of its environment: it must 
be able to detect whether a surface affords locomotion and whether a contour 
with internal texture affords collision; it must be able to detect whether a 
moving textured contour affords eating or whether it affords being eaten. 


“For an interesting experiment in insect behavior that is of some relevance to these 
comments and to the general theory of perception—action relations see Goggshall (1972). 
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In respect to the surface supporting locomotion the terrestrial animal must 
detect the gradients of optical texture specifying slant and slope, the topological 
shearing of texture specifying edge, and the changes of texture gradient specify- 
ing convexities and concavities. As it moves rapidly across a rough terrain, it 
must adjust its footfall pattern, temporally and spatially; it must adjust its gait to 
the wrinkled surface. It must detect surface protuberances and surface breaks 
that require leaping over as opposed to those that require going round or 
avoiding; it will often need to make transitions between running and leaping. 
With respect to the permanent properties of the environment we concur with 
Gibson (1966b) that the animal can detect in the changing optical flux those 
mathematically invariant properties that correspond to the physically constant 
object or surface and which afford for the organism possibilities for action. 

We are led, therefore, to a distinction between those properties of stimulation 
which afford approach, avoidance, pursuit, flight, changes in the footfall pattern 
of a gait, and transitions from running to leaping, and those properties of 
stimulation which control locomotion in each of these respects. It would seem 
that the former are those properties which do not vary over time while the latter 
are those properties which do. And the pickup of change and nonchange are 
concurrent perceptual activities. 


Tuning Reflexes and Environment-Related Tuning 


At this stage of our inquiry as to how vision enters into locomotion (and into 
action in general) we turn our attention to the concept of tuning reflexes. In 
addition to those reflexes that resemble parts of acts, such as the flexion and 
crossed extension reflexes, or are themselves simple yet self-sufficient acts such 
as the righting reflex and the scratch reflex, we can identify a further class of 
reflexes whose task it appears is to impose biases upon the action system. We can 
distinguish, therefore, between “elemental” reflexes and “tuning” reflexes 
(Greene, 1969). As illustrations of tuning reflexes we can take classically defined 
postural or attitudinal (Magnus, 1925) reflexes such as the tonic neck reflex, 
which biases the motor apparatus for movement in the direction of gaze, and the 
labyrinthine reflexes, which bias the musculature to resist motion on an incline 
or to resist rotation (Roberts, 1967). Quite recently evidence has been for- 
warded of low-level tuning resulting from movements of the eyes (Easton, 1971, 
1972a). In the cat, stretching the horizontal eye muscles facilitates a turning of 
the neck and head from the direction of gaze and stretch of the vertical eye 
muscles influences the forelimbs. Indeed, it appears that the eyes looking 
upward might foster forelimb flexion and the eyes looking downward might 
foster forelimb extension (Easton, 1972a). 

The principal function of tuning reflexes seems to be that of altering the 
intrinsic system of segmental relations rather than that of initiating configura- 
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tions of motions in components of the motor machinery.’ The impression is 
that tuning reflexes adjust the bias in the fundamental servomechanisms (cf. 
Gernandt, 1967). In general it may be argued that the main advantage of tuning 
reflexes, whether induced by prior motion or induced more directly, is a 
reduction in the detail required of high-level instructions (Easton, 1972a). Thus 
when a cat turns its head to gaze at a passing mouse, the angle of tilt of the head 
and the degree of flexion and torsion in the neck will elicit a reflex modulation 
of the segmental apparatus such that a broadly stated executive instruction to 
“jump” will be realized as a jump in the right direction (Magnus, 1925; Ruch, 
1965b). Clearly, such modulation must precede the innervation of muscles or 
the cat would constantly miss its target; obviously the cat in flight cannot rely 
on corrective feedback. 

How do tuning reflexes relate to the visual control of locomotion? Analysis of 
the biomechanics of walking and running in animals (e.g., Arshavskii, Kots, 
Orlovskii, Rodionov, & Shik, 1965; Shik & Orlovskii, 1965; Shik, Orlovskii, & 
Severin, 1966) reveals that with change in speed or gait the majority of 
kinematic parameters is kept constant, suggesting that adjustments in the loco- 
motion plan require a relatively minimal change in coordination. The action 
problem posed by the need to change speed of running or gait may be solved in 
most instances by a change in only two parameters. May we suppose, therefore, 
that a change in a small set of parameters is all that is needed to control 
locmotion through a “wrinkled” and object-cluttered terrain? Movement in 
a forward direction calls for a particular organization of the basic coor- 
dinative structures. If an animal so moving detects an invariant specifying an 
object or surface in its path that is to be avoided, then it must alter the 
organization of the relevant coordinative structures in order to change direction. 
But change in direction need not actually require direct executive intervention in 
the low-level organization of the locomotion plan; a shift in the direction of gaze 
may be all that is needed. In theory at least the tonic neck reflexes and related 
tuning reflexes could effect the necessary reorganization of the segmental 
apparatus. Similarly, if the contoured texture in the optic array afforded 


*The Potential range of changes in the segmental system induced by postural changes, and 
their implications for the behavior of coordinative structures is suggested in the following 
paragraph from an address delivered by Magnus almost fifty years ago: 


Every change in attitude, with its different positions of all parts of the body, changes 
the reflex excitability of these parts and in some cases changes also the sense of the 
reflex evoked, excitations being converted into inhibitions, reflex extensions into 
flexions and so on. One and the same stimulus applied to one and the same place on the 
body may give rise to very different reactions in consequence of different attitudes 
oe have been imposed to the body before the stimulus is applied [Magnus, 1925, p. 


For further intriguing and provocative comments on tuning reflexes see Jones (1965) and 
Fukuda (1957). 
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jumping on, then the act of directing the eyes or eyes and head upward would 
facilitate the transition in segmental organization from that of running to that of 
jumping. 

These examples suggest the following: in the course of locomotion the detec- 
tion of invariants affording specific changes in locomotion may serve to activate 
singularly simple action plans such as a change in the direction in which the head 
and/or eyes are pointing. Often these adjustments in orientation—owing to the 
functional tuning link between head and eye movements and the segmental 
apparatus—are sufficient to produce the needed parameters for the segmental 
realization of change in locomotion. 

As we have noted, the optical stimulation for a moving animal has components 
of both change and nonchange (Gibson, 1966b). If the components of non- 
change, specifying the permanent entities in the animal’s environment, relate to 
action plans and their activation, to what do the changing components of 
stimulation relate? We must suppose that they relate to the mechanisms of 
tuning; but how is this relation effected? The following considerations may help 
us to move toward an answer to this question. 

To leap from object to object is to project the body in particular trajectories, 
with each trajectory requiring different horizontal and vertical vectors of exten- 
sion thrust, Variations in force could be achieved either through variations in the 
degree of activation of coordinative structures or parts of coordinative structures 
as might be permitted by the local sign properties of reflexes (the dependency of 
reflex patterns on the origin of stimulation) or through direct facilitation of 
motor neuron activity, or both (cf. Easton, 1972a). In theory, both of these 
sources of force variation are plausible instances of tuning. We can say therefore 
that each leap calls for the specification of parameters to the intrinsic system of 
segmental relations where these parameters relate to the desired trajectory. Now 
we might ask whether trajectory-related parameters could be determined 
through tuning reflexes. But cursory analysis would suggest that mechanical 
modulation—spinal tuning elicited reflexively by a prior motion such as directing 
the eye-head system toward an object—is inadequate for the task. Consider a cat 
perched on a particular platform. At a distance of X feet from its perch is 
another, higher platform. Directing its gaze to the top surface of the higher 
platform yields, say, a particular angle of neck extension and hence a particular 
tuning of the segmental apparatus. Yet we observe that we could arrange any 
number of higher platforms of different heights at any number of reasonable 
distances either more or less than X feet from the cat’s perch that would 
correspond to the same inclination of the neck and hence to the same tuning 
parameters and hence, supposedly, to the same degree of thrust if the cat chose 
to jump. In brief, reflex tuning induced by any particular orientation of the 
eye-head system is ambiguous with respect to distance. Mechanically induced 
tuning, therefore, cannot supply the tuning parameters relevant to a given 
trajectory. How then are they supplied? We are forced to conclude that they are 
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supplied by the properties of the optic array that specify relative distance and 
height in the cat’s normal cluttered and textured environment, and that these 
optical properties are realizable as segmental tunings without the intervention of 
executive procedures and without mechanical mediation. 

With this conclusion in mind consider what we might now say about the 
scenario that unfolds when a scampering mouse appears at a leapable distance 
from an interested cat. In the cat’s field of vision the mouse is projected as 
changing patterns in the optic array. Concurrently, there is a pickup by the cat’s 
visual system of those properties of stimulation that change over time and those 
properties that do not. The former specify how far away the mouse is, in what 
direction it is moving, at what rate it is moving, and where it will be in a 
following instant relative to the cat; the latter specify the mouse’s identity as 
something that affords catching and eating. Orienting in the direction of the 
mouse adjusts the segmental apparatus through the tuning reflexes for a move- 
ment in that direction; as the direction of gaze shifts according to changes in the 
mouse’s location the mechanically induced segmental tuning likewise adjusts 
appropos the new direction. On activation of the action plan to pounce, the 
tuning parameters for the needed trajectory specified by the transformations in 
the optic array are given to the segmental system of interactions. The activation 
of coordinative structures then takes place against a backdrop of segmental 
relations appropriately adjusted for the generation of a precise, on-target leap. 

In this cat-and-mouse story there are two main themes: one is that the 
activation of crudely stated action plans and environment-related tuning are 
based on different properties of stimulation; the other is that the properties of 
visual stimulation that control movement and the family of possible tunings that 
effect the control are tightly linked. In Gibson’s view, perception is direct. He 
has also remarked that: “The distinction between an S-R theory of control 
reactions and and S-R theory of identifying reactions is important for behavior 
theory” (Gibson, 1958, p. 190). On this distinction we might now comment that 
in control reactions the relation is between changing properties of stimulation 
and patterns of tuning, and in identifying reactions it is between nonchanging 
properties of stimulation and action plans. 

Mittelstaedt (1957) describes a similar story about prey capture in the mantis. 
A mantis strikes its prey with pinpoint accuracy within a latency of 10-30 
msec, a period too brief to allow for adjustments during the course of the strike 
trajectory, The problem is to account for how this accuracy is achieved when the 
prey appears at a strikable distance either to the left or to the right of the body 
axis at some variable angle; and when the head is oriented at some (different) 
angle to the prothorax with which the forelegs—the striking instrument—are 
articulated. Mittelstaedt’s modeling of this sutuation implies that the visual and 
proprioceptive information specifying the relevant relations is conveyed not to 
the executive issuing the strike signal but to the segmental machinery of the 
forelegs. On our account we would say that the higher-order invariant specifying 
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“prey” triggers the strike command (the strike action plan) but the properties of 
optical stimulation specifying the coordinates of the prey with respect to the 
body axis, and its rate and direction of movement, do not enter into the 
executive decision, for most assuredly that would introduce undesirable delays. 
Rather, these properties are realized as segmental tuning parameters effecting 
needed adjustments in the centers controlling foreleg extension. We may say of 
the mantis’ prey catching that the prey determines the ballistic component while 
the prey’s location and movement determine the tuning component in a mixed 
ballistic-tuning strategy. Moreover, we recognize what might indeed be a general 
principle, namely, that different properties of stimulation enter into the unfold- 
ing act at different levels. 

In respect to this last point let us make one final comment on the topic of 
locomotion, which began this particular phase of our inquiry. We have argued 
that environment-related tuning is relatively independent of executive proce- 
dures. For locomotion we can say that tuning is coupled to the pickup of 
information conveyed by continuous transformations in the optic array. While 
the detection of higher-order invariants (affordances) may inject gross adjust- 
ments in locomotor activity, the fine control of locomotion in an object-clut- 
tered and wrinkled terrain is through environment-related tuning which adapts 
the activity to the conditions by modulating a relatively small set of parameters, 
and does so without involving the higher domains of the action coalition/heter- 
archy. 

Pal’tsev (1967a, b) advanced a theory that is of special relevance to this 
account of locomotor regulation. First, Pal’tsev (1967a) recognizes that in 
respect to uniform movements an argument can be made that in addition to 
movement-related segmental pretuning there is another type of reorganization of 
the segmental relations that is brought about during the execution of the 
movements. This latter form of tuning is due in Pal’tsev’s (1967a) view to the 
fact that to a very large extent the interactions among different structures of the 
spinal cord are reorganized by processes that are inherently spinal. The seg- 
mental apparatus tunes itself, as it were, in harmony with the main supraspinal 
influence. By comparing experimental results on the effects spinal reflexes 
induce in neighboring spinal reflexes with the general picture of locomotion 
Pal’tsev (1967b) is led to the supposition that following the first few locomotor 
cycles the strategic ordering of muscle events in locomotion can be determined 
solely by the segmental system of relations. As he sees it, the supraspinal 
patterns of feedforward serve only to identify and to “trigger” the particular 
locomotion plan: the continuation of the plan, the subsequent locomotor cycles 
in walking or running, is then the responsibility of spinal processes. Which is to 
say that control of locomotion is simply and elegantly transferred from supra- 
spinal structures to spinal structures. Thus, locomotion exhibited in the pursuit 
by a predator of its prey could proceed with insignificant involvement of the 
highest sectors of the action system. If such is the case, then it would be 
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propitious for the nervous system to exploit the principle of conveying visually 
specified adjustments in locomotion relatively directly to the segmental appa- 
ratus in which locomotion control is invested. This conclusion is consonant with 
the often expressed point of view that the spinal cord is a system that during 
action serves to integrate different supraspinal influences (cf. Pal’tsev & El’ner, 
1967). 


Two Kinds of Vision 


In some respects the ideas just expressed are reminiscent of the claim that there 
are two separate but interdependent visual systems related to action (Trevarthan, 
1968). It appears that a distinction is drawn in the neuroanatomy of the brain 
“between vision of relationships in an extensive space and visual identification of 
things” (Trevarthan, 1968, p. 301). In its simplest form the distinction is 
demonstrated most straightforwardly by the experiments of Schneider (1969): a 
hamster with intact superior colliculus but no visual cortex can orient to objects 
but cannot distinguish between them; conversely, with intact visual cortex and 
no superior colliculus the hamster can successfully distinguish objects but cannot 
locate them and orient to them except through trial-and-error. In very general 
terms it appears that there is a functional differentiation between two kinds of 
vision that relates in part to forebrain—midbrain differences. 

Let us remark briefly on the vertebrate midbrain. Suppose that we drew a map 
of the projections from the eyes to the midbrain tectum and suppose that we did 
so for two dissimilar vertebrates, the goldfish and the cat. The eyes of the 
goldfish are aligned roughly perpendicular to the body axis, while those of the 
cat are aligned parallel to the body axis. If one drew maps in the optical 
coordinates of the eye, one would find that for the two vertebrates the 
projection from the eyes to the midbrain differed considerably. But if the maps 
were drawn in the coordinates of the behavioral field, that is, with respect to the 
symmetry of the body, one would observe that the two maps were virtually 
identical. Indeed, if one went on to obtain such maps for other vertebrates, one 
would find that the mapping from eyes to tectum in the coordinates of the 
behavioral field is relatively invariant, and thus indifferent to the variation in 
alignment between eyes and body axis (see Trevarthan, 1968). One might 
conjecture that body-centered visual space is represented by a precise topograph- 
ical mapping in the midbrain in very much the same way in all vertebrates. 

This map of visual loci also maps a topography of points of entry into the 
action system. Stimulating points on the tectum produces orienting movements 
of eyes, head, and trunk to the corresponding visual location (cf. Apter, 1946; 
Ewert, 1974; Hyde & Eliasson, 1957). A singularly important feature of the 
midbrain is that in respect to the symmetry of the body it provides a precise 
topographical map of points in visual space and a virtually identical map of 
orienting movements to those points (Apter, 1946). Because of this feature the 
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midbrain serves to map object locations onto the set of movement-induced 
tunings. But there is reason to suppose that the capabilities of the midbrain 
extend beyond this and are concerned in a more general way with the control of 
locomotion. 

Let us say that the two kinds of vision relating to forebrain-midbrain differ- 
ences relate in turn to different kinds of acts performed in the animal’s 
behavioral space. Discussing primates, Trevarthan (1968) conveys the tenor of 
this viewpoint as follows: 


Orientations of the head, postural adjustments, locomotor displacements change the 
relationship between the body and spatial configurations of contours, surfaces, events, 
and objects. These movements occur in what I shall call ambient vision. In contrast 
praxic actions on the environment to use pieces of it in specific ways are performed with 
the motor apparatus of the body and the visual receptors oriented together so that both 
vision and the acts inflicted on the environment occur in one part of the behavioral 
space. The vision applied to one place and a specific kind of object, or deployed in a 
field of identified objects, I shall call focal vision [p- 302].” 


Trevarthan builds his case upon facts found in the effects of surgically 
separating the cerebral hemispheres. This separation exhibits many instructive 
and curious phenomena, including that of central concern to Trevarthan’s 
thesis—the capability of the split-brain primate to double perceive and learn for 
some types of visual stimulation but not for others and correspondingly to 
perform some aspects of visually defined acts chaotically and yet to perform 
others with no evident impairment. We note that the separated cortices may 
learn, independently and simultaneously, conflicting solutions to a visual dis- 
crimination problem when the stimuli are clearly of different identities, as in the 
example of cross versus circle, but not when the stimuli differ on a single 
dimension such as bright versus dim, In the former case, that of an identity 
difference, what is learned by one hemisphere is available to the other only if it 
in turn has the opportunity to learn the same thing; in the latter case, what is 
learned by one hemisphere is without practice available to the other. The 
inference is that differences in degree may be apprehended by visual mechanisms 
of the midbrain, while the apprehension of differences in identity is the responsi- 
bility of cortical visual mechanisms (Trevarthan, 1968). And Trevarthan empha- 
sizes that it is the transformations relating the to-be-distinguished stimuli rather 
than the ease of distinguishing between them that is important to the dissocia- 
tion. 

Paralleling this dissociation in split-brain vision is a dissociation in split-brain 
action. If an object such as a peanut is presented to the commisurectomized 
primate both hands may reach forward with precision to grasp it; however, the 
activities of the two hands appear indifferent to each other, resulting often in 
collision. Given an object to manipulate and explore, the split-brain displays an 
inability to relate the activities of the two hands. The needed collaboration is 
replaced by redundant and conflicting movements. 
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In sharp contrast to these anomalies of voluntary movements of the hands in 
the field of focal vision, no such schism is witnessed in locomotion in which the 
hands play an important role. Locomotion-related movements of the arms and 
hands are properly coordinated to each other and to the motions of the hind 
limbs; and in terms of displacement, velocity and timing are finely attuned to 
the environmental structures supporting the action (Trevarthan, 1968). 

While there are many more questions to be asked of these dissociations in 
vision and action and of the relation between them we can with some reasonable 
certitude draw the following conclusions. First, low-level sections of the visual 
system can effect the pickup of transformations in the optic array corresponding 
to changes in gross environmental properties such as texture and contour and to 
the detection of simple invariants such as solidity—in short, those properties of 
the optic array relevant to the control of locomotion. And in this regard it is of 
some import to note that electrical stimulation of the midbrain can bring about 
parameter changes in the segmental functions governing locomotion (Shik, 
Severin, & Orlovskii, 1966). Second, the higher-level sections of the visual 
system detect higher-order invariants specifying identity and more complex 
transformation that would be relevant to and indeed result from the skilled 
manipulation of objects. For it is evident that separating the hemispheres gives 
rise to two separate visual frames for the regulation of manipulative behavior and 
to a consequent breakdown in coordination between the two hands, but leaves 
intact the visual frame for the regulation of locomotion. 


RELATING THE CONTENTS OF VISION TO ACTION: 
A SUMMING UP 


We come now to a general summary of these speculations on how vision enters . 
into action. We have provided two rather different descriptions of the unfolding 
act, and it will be helpful to collect them together at this time. In one we 
envisaged the act as evolving through the establishment of progressively less 
abstract representations, from the specification of relations among and within a 
few relatively large coordinative structures to the specification of relations 
among and within many relatively small coordinative structures, the fundamen- 
tal servomechanisms. In the other we saw the action plan unfolding as ordered 
successions of progressively less abstract projections of the environment. What 
we must now attempt is a reconciliation of these separate views. 

The kernel notion is this essay has been the idea of building acts through the 
fitting together of relatively autonomous units. This principle of operation 
reflects the fundamental argument that there are far too many degrees of 
freedom in coordinated activity for it to be controlled by a single procedure in a 
single instant. One consequence of this viewpoint is that the initial representa- 
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tion of an act in the highest domain must necessarily be crude in comparison to 
its ultimate representation in terms of instructions to muscles. 

Similarly we saw that in view of the degrees of freedom problem the represen- 
tation in the highest domain could not be constructed in respect to the details of 
skeletal space; the perception of the disposition of the limbs and branches of the 
body at any moment can only enter into the representation in the most general 
way as an abstracted account of the body’s “pose” at that instant. Using very 
much the same rationale we are led to believe that in interactions with the 
environment not all the contents of vision can be involved in the determination 
of the initial representation of an act. Again we suppose that the executive 
procedure uses only the perceptual description that it can handle; the descrip- 
tion cannot be detailed and by necessity must be fairly abstract. Earlier, 
following Bernstein (1967), we used the term “topological properties” to iden- 
tify the description of the environment to which the initial representation of the 
action plan related. We may now regard these properties as invariants of a higher 
order, for example, those which specify the identities of objects and their 
possibilities of transformation. In any event, the manifestation of the action plan 
as motions finely attuned to the nuances of the environment’s structure tells us 
quite plainly that the detailed contents of vision must be interjected into the act 
during its evolution. We have argued that tuning of coordinative structures is 
probably the mechanism through which the interjection of environmental details 
is brought about. 

If these speculations are not too far off the mark, then we might further 
conjecture as follows. The determination of an act as an orderly pattern of 
motions is distributed across many structures. In the coalitional/heterarchical 
language used above, we say that it is distributed across different domains. But 
where the differentiation of an action plan requires information about the 
environment, we should suppose that the procedures operating at each domain 
incorporate optically specified environmental properties. It seems unlikely how- 
ever that the entry of environmental properties into the various representations 
of an unfolding act is a haphazard affair. Rather, we hypothesize that the 
properties of the optic array interlace with the representation of an action plan 
in a systematic fashion: different properties map into different representations. 
We have, of course, already implied that this might be the case in arguing that 
the specification of action plans and the tuning of structures correspond to 
different properties of stimulation. But now suppose that the properties of 
stimulation relevant to the control of action may themselves be arranged from 
more complex to more simple; then perhaps we can imagine a natural mapping 
of these properties onto the unfolding act, a mapping that preserves their order. 
Of the properties of optical stimulation relevant to the control of action, those 
of a higher order are realized as tunings in higher domains and those of a lower 
order are realized as tunings in lower domains of the action heterarchy. 
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We conclude with some final thoughts on the general characterization of the 
perception-action relation. With respect to the representation of the action plan 
in the highest domain, it is not so much that the specification of a subset of large 
coordinative structures and functions defined on them relates to higher-order 
properties of the optic array but rather that the description of the plan in action 
terms and the description of the plan in perceptual terms are dual statements 
about the same thing. Earlier we described the action concept for A writing as an 
operator defined over a set of functions relevant to the manipulation of coordi- 
native structures; but we have also referred to the action concept for A writing 
in geometrical terms consonant with the points of view expressed by Bernstein 
(1967) and Lashley (1951), and further suggested that the two descriptions were 
isomorphic. Similarly, with respect to tuning, we have implied that there is a 
relatively direct mapping of the properties of optical stimulation relevant to the 
control of action onto the set of tunings. To draw these concepts together, we 
can say that “detection of control-relevant optical properties” and “specification 
of environment-related tuning parameters” are descriptions of the same event: 
one is the dual of the other. 

Perhaps we can gain a purchase on the duality of perception and action events 
by considering a problem drawn from a rather special domain of the perception- 
action relation—communication between members of the same species. For a 
variety of reasons it has been suggested that the perception of sounds of speech 
is achieved by reference to the mechanisms of articulation (see Galunov & 
Chistovich, 1966; Liberman ef al., 1967; Zinkin, 1968). One version of this 
action-based theory of speech perception suggests that the listener seeks to 
determine (tacitly and unconsciously of course) which phoneme articulation 
plans could produce the acoustic pattern; the listener uses the inconstant sound 
to recover the articulatory gestures that produced it and thereby arrives at the 
speaker’s intent (Liberman et al., 1967). Other students of speech, however, 
have argued against the articulatory matching explanation of the perception of 
speech sounds and have suggested that the explanation be sought in the sensitiv- 
ity of the nervous system to higher-order properties of acoustic stimulation (e.g., 
Fant, 1967, Abbs & Sussman, 1971). There is growing evidence for neural 
mechanisms that selectively respond to complex acoustic invariants (e.g., Roeder 
& Treat, 1961, Frishkopf & Goldstein, 1963; Capranica, 1965), and it is becom- 
ing increasingly less venturesome to propose that the perception of phonological 
attributes of speech is direct rather than mediated (cf. Abbs & Sussman, 1971). 
However, viable descriptions of invariants in speech stimulation have been 
elusive. 

Commendable as a direct perception interpretation is, we still must account 
for the evidently tight coupling between structures detecting speech sounds and 
Structures producing speech sounds (see Chistovich, 1961; Chistovich, Fant, 
deSerpa-Leitaö, & Tjernlund, 1966). Suppose, as Gibson (19665) suggests, that 
vibratory patterns specify their source. Then we can say that a listener perceives 
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articulation because the invariants of virbration correspond to the invariants of 
articulation: the phonemes are present in the neural activity and vocal tract 
activity of the speaker and in the air between the speaker and the listener. Thus 
the linguistically relevant invariants on the input side are the same as the 
linguistically relevant invariants on the output side, and it is in this sense that 
perceiving and producing speech correspond (cf. Halwes & Wire, 1974). Now 
suppose that we were to describe an articulatory action plan as a set of relations 
defined over a collection of coordinative structures, then we would argue that 
our description is also a description of the relevant relations in the acoustic 
pattern. An appropriate analogy is the group concept in mathematics: given two 
different sets of elements, with a group structure defined on each, we might find 
that although the elements differ (even radically) in the two instances, their 
manner of inner interlocking is the same, in which case we say that they 
represent the same abstract group. Our hypothesis, therefore, is this: the struc- 
ture that affords perception of a speech sound also affords its production; 
speech perception and speech production are related by abstract structures that 
are common to both but indigenous to neither (cf. Turvey, 1974). There is some 
evidence, though slight, that structures with this dual property may have been 
exploited in the evolution of intraspecies communication. For example, the 
calling song of male crickets is composed of stereotyped rhythmic pulse intervals. 
Cross breeding of two species of crickets with marked differences in the 
rhythmic structure of their songs produces hybrids whose calling song is dis- 
tinctly different from either parental song. It has been shown that genetic 
differences that cause song change in males also alter song reception in the 
females: hybrid females prefer the song of hybrid males (Hoy & Paul, 1973). 
What intrigues is the speculation that the action plan for song generation in the 
male and the female’s selective sensitivity to the male’s song are coupled through 
a common set of genes (Hoy & Paul, 1973). Thus, at some level of abstraction 
the same structure may underlie song production in the male and song reception 
in the female. 

Whether a stronger and more general case can be made for the dual representa- 
tion notion remains to be seen. There is the possibility, of course, that the 
principle we have tried to describe has meaning, if at all, only in the communica- 
tion mode: speaking and perceiving speech, reading and writing, and the primi- 
tive instantiations of signaling in animal and insect communication. On the other 
hand, when one considers the failure of schemes in which sensory input is routed 
through a central network into motor responses, the growing uneasiness over the 
application of the terms “sensory, motor, associative” to higher neural struc- 
tures, the increasing usage of the bimodal term “sensorimotor” (see Evarts et 
al, 1971, for comments on each of these points), and the arbitrariness of 
action-based theories of perception, then the notion of perceiving and acting as 
dual representations of common neural events may be a reasonable alternative to 
the sensory and motor views of mind. 
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A Conceptual Framework for 
Cognitive Psychology: 
Motor Theories of the Mind’ 


Walter B. Weimer 
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The recent trend away from the behavioristic paradigm has become all but a 
stampede, as the resurgence of cognitive psychology illustrates. Yet as has often 
happened in the history of science, there is a danger that this paradigm (in 
Kuhn’s, 1970, sociological and exemplary senses of the term) will be aban- 
doned before the real reasons for its inadequacy are fully understood by the 
domain’s practitioners. Such paradigm clashes tend to obscure important things, 
such as that the successor paradigm, to all intents a totally contradictory 
position, will often share the most important mistakes of its predecessor. This is 
especially true at the metatheoretical level, and a “new look” theory will often 
have limitations within itself that its protagonists will have cited as reasons for 
rejecting its predecessor. Much of cognitive psychology, as the “revolutionary” 
successor to behaviorism, seems to have made exactly this mistake with regard to 
its conception of mind and its functioning. 

This chapter argues that much contemporary ‘cognitive psychology”? is neither 
cognitive nor adequate as a psychology, because it shares an inadequate meta- 
theoretical conceptual framework with its predecessors (such as behaviorism). 


"This chapter is adapted from Chapter 10 of my forthcoming Structural Analysis and the 
Future of Psychology. Some of its major themes were presented at the Conference on 
Perception, Action, and Comprehension held at the Center for Research on Human Learn- 
ing, University of Minnesota, August, 1973. 

? Throughout the customary philosopher's distinction separates single from double quota- 
tion marks. Double quotation marks indicate direct citation or the use of a word or phrase 
in its traditional connotation; single or scare quotes indicate that the traditional connotation 
does not apply. 


267 


268 WALTER B. WEIMER 


Despite its claims to be revolutionary, and its affiliation (through psycholinguis- 
tics) with the revolution in linguistics occasioned by Chomsky, current cognitive 
theory and research is little better than work done within behaviorism. I wish to 
argue that the major advances, such as denial of the sufficiency of associationism, 
and of the utility of the “principles” of reinforcement and learning (conditioning), 
readmittance of mentalistic entities and theoretical formulations, and even the 
adoption of a constructive approach to cognition, are not enough. A ‘cognitive’ 
psychology that makes these changes without repudiating the conceptual frame- 
work underlying behaviorism (and its predecessors) is no more adequate than 
behaviorism. Where theory and research within such a cognitive framework have 
managed to contribute to our understanding of mind, it has been in spite of that 
framework rather than because of it. 

Such a sweeping claim needs to be given specific formulation, My argument is 
wholly metatheoretical. The implicit (occasionally explicit) conceptual frame- 
work behind such cognitive theories is inadequate. Both cognitive psychology 
and behavioristic psychology fail because they embrace a sensory conception of 
mind. Conversely, an adequate and truly cognitive psychology must adopt a 
different metatheoretical framework and explore theories consonant with what 
can be called a motor conception of mind. This chapter attempts to characterize 
both sensory and motor conceptions of mind (and the theories consonant with 
them) (Note 1) and argues for the indispensability of motor theories for any 
adequate psychology. 


1 THE SENSORY METATHEORY OF MIND 


Psychology, with its long past and short history, has modern formulations 
stemming from ancient roots. Since the time of Aristotle the mind has been 
regarded as intrinsically sensory in nature, as a passive black box or window that 
is (somehow) sensibly impressed by input from the environment. A root meta- 
phor of mind has evolved from the common-sense, everyday experience of 
looking at the world. Vision, conceived as the passive reception of information 
that both exists and possesses an intrinsic psychological character independently 
of the organism, became the paradigm exemplar of mental processing. At the 
same time, behavior, or motor activity, became divorced from mental activity 
and was seen as a consequence, or by-product following from genuine (or 
sensory) mental events. 

This “common-sense” reasoning, although a refreshing and bold conjecture to 
the ancient Greeks, has now become a literally truistic and trivial common-place. 
To argue against this root metaphor of the sensory conception seems at first 
inconceivable. Arguments in favor of the motor conception are often rejected 
out of hand by theorists who reason that “everybody knows that x, y, and z” 
where these variables range over essential tenets of the sensory metatheory. This 
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chapter argues that what everybody knows, the common sense of contemporary 
psychology, is not only dead wrong both conceptually and empirically, but also 
retarding the development of an adequate psychology. It argues against a 
position whose familiarity has made it credible and acceptable, in support of an 
alternative which is (initially, at least) seemingly incredible, and therefore, 
unacceptable. We shall proceed by examining some aspects of the sensory 
metatheory per se (this in itself is a significant task, since the transparency and 
obviousness of the position guarantee that it is rarely scrutinized—it has become 
conceptually invisible), by providing scanty documentation of the position in 
contemporary literature, and then by arguing that on the basis of considerable 
highly corroborated data and theory, it is actually the sensory metatheory that is 
incredible and to be rejected. 


1.1 Some Basic Sensory Tenets 


Metatheories are like perspectives or vantage points: they provide a point of view 
from which a domain may be scrutinized. In providing such conceptual under- 
pinning to a domain of inquiry a metatheory is in itself all but invisible: one 
“sees” the domain through the conceptual glasses that constitute the meta- 
theory, but one does not see the metatheory itself. Thus, a metatheory is 
characterized only indirectly, by pointing out those of its aspects that structure 
and constrain the domain addressed. Taking the sensory metatheory in its 
simplest abstraction, as the view that everything that is properly mental (for the 
philosopher) or cognitive (for the psychologist) involves the sensory (specifi- 
cally, what has classically been called the sensory component of the nervous 
system), let us look briefly at how it has colored three main approaches to 
cognitive psychology. 

Consider the following statement from a text rightly regarded as having 
reintroduced cognitive psychology to America, after five decades of behav- 
iorism: 

The term “cognition” refers to all the processes by which the sensory input is trans- 

formed, reduced, elaborated, stored, recovered, and used. It is concerned with these 

processes even when they operate in the absence of relevant stimulation, as in images 
and hallucinations. Such terms as sensation, perception, imagery, retention, recall, prob- 
lem-solving, and thinking, among many others, refer to hypothetical stages or aspects of 
cognition. . .. The organization of this book follows a sequence which is logically 
implied by the [above] definition. .. . It follows stimulus information “inward” from 
the organs of sense, through many transformations and reconstructions, through to 
eventual use in memory and thought [Neisser, 1967, pp. 4, VII]. 


This approach makes cognitive psychology, which Neisser is at pains to point out 
is an approach to the entire psychological field, a matter of “Stimulus Informa- 
tion and its Vicissitudes [p. 4] .” Not surprisingly, there are no chapters on motor 
functions, such as skilled behavior or coordination and control, to be found in 
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this book. Neisser’s contribution, as evidenced by his own evaluations, has been 
to reintroduce a constructive approach to cognition, and to argue for the active 
nature of the processor who cognizes (especially in viewing memory as recon- 
struction rather than storage; see the “Utilization Hypothesis” in Chapter 11) 
(Note 2). In gross oversimplification then, cognitive psychology for this text is 
the application of constructive formulations to the entire realm of information 
(stimulus) processing. 

An entire generation of researchers has grown up trained on Neisser’s book, 
and it is typical to find in the literature echoes such as this, written as 
uncontroversial statement of fact in a book review by a theorist in the informa- 
tion-processing camp: 


When stimulation is first presented to the mind, it is automatically acted upon by the 
system in a series of stages. At each stage features are abstracted and used for 
subsequent stages of abstraction. Thus, a word presented visually may be encoded in 
features first representing areas of dark and light, then lines, angles, shapes, and so on, 
then letters, then letter groups, then words coded linguistically and semantically, then 
synonyms and associates, then the overall context or situation in which the word 
appears. ... The features that are abstracted are placed in the short-term memory store, 
where they may be used by the subject. .. . Eventually some portion of this information 
in short-term memory is transferred to, and stored in, long-term memory and the coding 
process is complete [Shiffrin, 1973, p. 400]. 


Similar comments appear in the revival of the psychology of consciousness 
that is an indispensible component of Ornstein’s (1972) attempt at a synthesis of 
traditional and esoteric traditions: 


It is the function of sensory systems, then, by their physiological design to reduce the 
amount of “useless and irrelevant” information reaching us and to serve as selection 
systems. The information input through the senses seems to be gathered for the primary 
purpose of biological survival [p. 23].... We first select modalities of personal con- 
sciousness from the mass of information reaching us. This is done by a multilevel process 
of filtration, for the most part sorting out survival-related stimuli. We are then able to 
construct a stable consciousness from the filtered input [p.17]. 


Common to these positions is an implicit notion that cognition is to be 
understood “from the outside inward,” that it is a matter of the structuring and 
restructuring of sensory information by intrinsically sensory systems, and that 
the products of cognition must somehow subsequently be married (in a peculiar 
sort of shotgun wedding) to action. Thus, cognition has a puzzling, dualistic 
character for sensory theorists: “Many complex memory representations appear 
to have a dual character. On the one hand, they can be brought to consciousness 
and inspected like an image. On the other, they can produce a particular 
behavior [Posner, 1973, p. 8].” And in all, constructive theorizing is not an 
issue: even Skinner (though he would never admit it) is a constructivist when 
dealing with cognition. 
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1.2 The Communalities of Sensory Psychology 


Since the recent abandonment of passive, empty organism stimulus-response 
formulations in favor of active, constructive approaches is usually taken as a 
hallmark of the cognitive revolution, this last point may occasion surprise. Thus, 
it may be helpful to detour through some problems (and their tentative solu- 
tions) that have occupied the sensory metatheory. We can see some of the 
communality in opposing views by looking at such seminal issues as what is 
learned in perception, what is stored in memory, and the genesis of action. 


1.2.1 Perception and the informational basis of behavior. Behaviorism pro- 
claimed its scientific status by decreeing that the stimulus must be physically speci- 
fiable. Having done so it replaced “perception” by “disposition to respond,” claim- 
ing that the problem of perception is to specify the particular stimuli that result, 
via learning, in particular behaviors. The retreat from classic behaviorism acknowl- 
edged the abstract nature of the stimulus, and the distinction between functional 
and physical stimuli arose. It got to the point that the ‘stimulus’ in, for example, 
concept learning became a relational concept (Bourne, 1966). With this admis- 
sion the behaviorist became a cognitive psychologist who, for better or worse, 
clings to an old “habit” of classifying everything as either a stimulus, response, 
or reinforcement. Where is the revolutionary difference between the cognitive 
and behavioral theorist if both acknowledge that the informational basis of 
perception (and hence, learning) is intrinsically abstract? 


1.2.2 Memory and the utilization of “stored” information. For classic behav- 
iorism, as Neisser points out in discussing the Reappearance Hypothesis, what was 
learned and what was remembered (habits or associations) were enduring entities 
that could be somehow retrieved from a storage compartment deep within the 
organism. The “revolution” has been to abandon storage of surface elements in 
favor of the ‘storage’ of deep structure rules for the generation of elements. Note 
2 has quoted how Skinnerian behaviorism has claimed exactly this position as its 
own. Let us quote Skinner (1974) again: 

The metaphor of storage in memory ... has caused a great deal of trouble. The computer 

is a bad model. . . . (It is not the behaviorist incidentally, but the cognitive psychologist, 

with his computer-model of the mind, who represents man as a machine.) [p. 110] 


All that is wrong with this quotation is that Skinner has misperceived a 
monolithic opposition, and fails to distinguish storage theorists within the 
cognitive camp (e.g., Melton & Martin, 1972; Miller, 1974) from those who do 
not utilize computer metaphors (e.g., Gibson, 1966; Neisser, 1967). So both 
behaviorism and cognitive psychology do and do not utilize storage concepts to 
account for memory. Where are the revolutionary differences? What principled 
distinctions remain when a behaviorist such as Paivio (1971) can call himself a 
cognitive psychologist simply for studying images? 
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1.2.3 Action and the initiation of behavior. How is action generated? If we 
grant any analysis of the input or stimulus information whatsoever, how is it 
connected to behavior? Naive behaviorism (of the Watsonian genre) believed the 
myth that all behavior is reflexive, and that output was tied to input by chains 
of reflexes. No one believes that today, and with historical hindsight everyone 
cites Lashley’s (1951) paper on serial order to support the claim. But what 
mechanisms are available? The behaviorist faces theoretical bankruptcy here, for 
reflexes and associations just won’t work. There just is no behaviorist account of 
the genesis of action worth criticizing. Does this mean victory for the cognitive 
theorist? Hardly. The revolution here has been negative in effect: pointing out 
the inadequacy of extant formulations is no substitute for a viable theory of 
action. And the classic cognitive theorists have no theory at all. Neisser does not 
even discuss the topic, and for that matter, neither does anyone else (except 
certain neurophysiological psychologists: see Section 2). 


1.3 Toward a Conceptually Adequate Psychology 


The fact that enterprising behaviorists can make their views consonant with what 
is typically taken to be representative cognitive formulations is not surprising— 
they are indeed an enterprising lot. Nor does it mean that there are not 
thoroughly convincing arguments against behaviorism that make this tiresome 
game of catching up far too costly to play for those who are intellectually 
honest (see Weimer, 1974, and the references therein cited), What it does mean 
is that much so-called cognitive psychology is totally inadequate to the task of 
characterizing the human higher mental processes and their relation to behavior. 
It is time to explicitly adopt, in addition to the philosophical and methodologi- 
cal arguments against behaviorism, an adequate conceptual framework for cogni- 
tive theorizing. In order to effect a principled separation of theoretical cognitive 
psychology from positions such as behaviorism one must abandon the sensory 
metatheory in favor of the motor framework. Such a move will have numerous 
consequences, including providing answers for the problems discussed in Section 
1.2, answers that successfully integrate acting and perceiving, something that has 
never before been done by either behaviorism or cognitive psychology. What the 
motor metatheory asserts is that there is no sharp separation between sensory 
and motor components of the nervous system which can be made on functional 
grounds, and that the mental or cognitive realm is intrinsically motoric, like all 
the nervous system. The mind is intrinsically a motor system, and the sensory 
order by which we are acquainted with external objects as well as ourselves, the 
higher mental processes which construct our common sense and scientific 
knowledge, indeed everything mental, is a product of what are, correctly 
interpreted, constructive motor skills. Let us indicate why this must be so by 
overviewing the physiological basis of action. 
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2 PHYSIOLOGICAL PSYCHOLOGY AND THE ACTIVE MIND 


Recent research in neuropsychology has led to the abandonment of sensory and 
associationist conceptions of mind. By recapping the progressive refutation of 
building block, reflexive conceptions of neural functioning, and the input-out- 
put chaining models of cortical control, we can see why sensory theories of mind 
must be abandoned. 

Neuropsychology originally accepted reflexology as its theoretical model of 
neural functioning. Pioneered by Sechenov in Reflexes of the Brain (1863), it 
was popularized by Sherrington (1906) and taken as established scientific fact 
into behaviorism from the writings (as they were misunderstood by John B. 
Watson) of Pavlov (1927). In this framework all behavior is the result of 
input-output connections between the sensory and motor components of the 
central nervous system (CNS). The motor cortex of the brain was regarded 
variously as a “switchbox” or “piano keyboard” that activated muscular (and 
hence, behavioral) output. The motor cortex was the keyboard upon which the 
sensory mind played to produce behavior. The only question concerned whether 
the keys encoded a representation of individual muscles or represented more 
complex movements and combinations. But more recent research, overviewed by 
Eccles (1973) and Pribram (1971a, b), has devastated this simplistic conception. 
Let us overview the motor control of action, and then look at symbolic activity 
as an instance of brain functioning, to see how this leads to a generative, motoric 
conception of mind. 


2.1 The Motor Control of Action 


The first problem for the discrete input-output reflexological approach came 
with the discovery of the y efferent system. The motoneurons of the pyramidal 
tract are composed of large (a) and small (y) fibers. The a fibers exclusively 
innervate extrafusal (contractile) muscle fibers while the y fibers innervate 
muscle spindles lying in parallel to the ‘extrafusal muscle. When y fibers are 
electrically stimulated, a change in the afferent signals going to the spinal cord 
from the muscle receptors results. One-third of the fibers leaving the spinal cord 
for muscle tissue are y fibers and do not result in muscle contraction, but rather 
influence muscle receptors independently of any changes produced in the 
muscle. Thus, signals from muscle receptors to the brain do not accurately 
teflect the state of contraction of the muscle, but are instead part of a 
servomechanism or servo control loop (Eccles, 1973, p. 115). The action of the 
loop can be biased by the motor cortex over a wide range by discharge of the Y 
fibers, and y fiber activity “presents” muscle contraction much as does the bias 
knob on a thermostat. It is now known that voluntary movements are initiated 
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by « contraction, which results in coactivation of the y loop fibers, which in 
turn gives automatic adjustment to a load (at the spinal level of control). 

When knowledge of the action of the y feedback loop is combined with the 
realization that the motor cortex is informed by sensory nerves from skin and 
muscle through pathways as direct as those by which such signals reach the 
sensory cortex (Pribram, 1971a), whose primary function appears to be to signal 
the cortex the progress of the movement it has just programmed, the reflex 
conception is seen to be untenable. 

But if not by simple reflexes how does the cortex control action? Granting 
that its mechanism is motoric, a servo-loop control system rather than a reflexive 
one, how does the ‘keyboard’ operate? Pribram discovered that ablation of 
motor cortex has little effect on muscle function in particular movements, but 
greatly disrupted skilled action: 

My interpretation of this finding was that behavioral acts, not muscles or movements, 
were encoded in the motor cortex. An act was defined as an achievement in the 
environment that could be accomplished by a variety of movements which became 
equivalent with respect to the achievement. Thus, a problem box could be opened by 
use of a right or left hand; amputees have learned to write with their toes. Encoded in 


the motor cortex are the determinants of problem solution and of writing—not the 
particular movements involved in the performance [Pribram, 1971b, p. 14]. 


But what determinants of action are encoded in the motor cortex? Evarts (1967; 
Evarts & Thach, 1969) found that the motor cortex cells that fired when a 
monkey pushed a lever to and fro were responding not as a function of length of 
movement or stretch of muscle, but rather as a function of the force needed to 
perform the task. The motor cortex represents the field of forces necessary to 
achieve a functionally specified action: 


The brain’s motor mechanism can encode the set points, the information necessary to 
achieve certain acts. The brain need not keep track of the rhythms of contraction and 
relaxation of individual muscles necessary to achieve an act any more than the thermo- 
stat needs to keep track of the turnings on and off of the furnace. The encoding 
poem is immensely simplified—only end states need to be specified [Pribram, 1971b, 
pp. 16-17]. 


These results, along with the classic study of Bernstein (1967) and current 
research (e.g., Greene, 1971) reviewed by Turvey in this volume, are devastating 
to both the reflex model of neural functioning and the sharp distinction of 
sensory and motor components of the mind. Reflexology is ruled out because of 
the feedback loop control of muscle contraction at the periphery, and because 
the keyboard in the cortex controls abstract, functionally specified actions 
rather than particular muscle movements. The cerebellum is also intimately 
concerned with the skilled control of functionally specified movements (Note 
3). But equally important, it becomes obvious that the classic sharp separation 
of sensory and motor components is untenable. The feedback loop is a motor 
component performing a sensory function, and some ‘sensory’ fibers project 
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directly to motor rather than sensory cortex. Further, the information coded in 
the ‘motor’ cortex is quite ‘sensory’ in nature, being concerned with, as Pribram 
puts it, the image of achievement of abstract actions specified in terms of their 
functional consequences. This stretches Aristotelian conceptions of mind to the 
breaking point, and forces a return to the active epistemologist’s conception of 
mind as a generator of not only its output but also its input (Note 4). 


2.2 Sign and Symbol as Motoric Skill 


One may grant the motor theory of mind for behavioral action, but what of the 
higher, symbolic functions characteristic of man’s mental functioning? Can the 
use of signs and symbols really be understood upon the model of actions? 
Following Pribram, I wish to argue that not only can they be so interpreted, 
but that there simply is no alternative to doing so. 

Let us begin by defining signs and symbols. A sign is a token of a context-free 
code. A symbol is a token of a context-sensitive (restricted) code. A code is a 
pattern of repetition. Signs and symbols are representations, tokens that “stand 
for” something else. Symbols are signs that derive their meaning from a particu- 
lar context. Letters on a page are signs, but the meanings that they convey to a 
reader when part of a written natural language are symbolic. The question we 
must consider is: How does the brain control its input (and output) in determin- 
ing signs and symbols? 

As we know from overviewing the metatheory, the traditional sensory answer 
postulates a series of linearly chained processing steps, in which progressively 
more abstract features are extracted (recall the quotation from Shiffrin in 
Section 1.1). The simpler features are supposed to be abstracted (via the 
subtractive theory of abstraction: see Cassirer, 1953) at the receptors them- 
selves, then more abstract features at the intermediate way stations and “pri- 
mary projection areas” of the cortex. The very complex information necessary 
to utilize signs (say in discrimination learning or delayed response tasks) was 
assumed to involve “association” areas of the cortex immediately after the 
projection areas, according to the sequence in Fig. 1. If this linear processing 
model were correct, then destruction of connecting tissue separating the primary 
projection areas from the relevant “association” areas should impair signing 
ability. Considerable evidence (reviewed in Pribram, 1971a) indicates that this is 
not the case, and that the classic picture is incorrect. Cortico-cortical relays are 
not the mechanism of signing and symbolizing (at least in nonhuman primates). 

The evidence available indicates that information from the cortex is trans- 
mitted “downstream” to subcortical structures for preprocessing, and then 
relayed to the primary sensory and motor cortex. Instead of the classic view that 
perception occurs in the primary sensory cortex and that “what is learned” is 
then added by the association of input from the association cortex, in the case 
of vision, for example, “the inferior temporal cortex influences visual processes 
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Cortico—cortico connections 
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FIG.1 Diagram of classic conception of (visual) perceptual abstraction and integration. 
“Higher” perceptual abilities were assumed to involve successive abstraction of information 
from preceding areas, such that, for example, pattern perception involved cortico-cortico 
connections. 


not so much because it receives visual information from the primary cortex, but 
because it operates through corticofugal connections on visual processes Oc- 
curring in the subcortical structures [Pribram, 1971a, p. 318] .” Motor involve- 
ment indispensibly enters at this point: “Sign and symbol manipulation thus 
involves the same brain structures that are used by the organism in the construc- 
tion of acts. The suggestion that derives from these anatomical facts is that 
signifying and symbolizing are acts, albeit acts of a special sort [Pribram, 1971a, 
p.22].” 

The structures implicated in signification and symbolization are anatomically 
and functionally somewhat distinct, which results in reasonably specific postula- 
tions concerning their motoric nature and involvement (Note 5). But for our 
purposes, the point remains: not only are higher, symbolic activities intrinsically 
motoric, but they can be conceived as actions, skilled activities of the CNS, 
analogously to the way in which skilled movement is conceived. [They are acts 
of a special sort (as in the quotation above) only in that they are mental and 
symbolic.] 

Signification and symbolic activity, which involve the creation of meaning, 
appear to occur when cortico-subcortical connections are activated, and these 
connections relay in brain stem structures implicated in motoric function (see 
Fig. 3 in Note 5). Thus Pribram proposes a “motor theory” underlying meaning: 


Man makes meaning through signs. symbols, propositions and reasoning by way of 
corticofugal-subcortical connections that importantly involve the motor mechanisms of 
the brain. I propose that man’s thrust toward meaning derives from the fact that his 
oa motor mechanisms are better developed than those of animals [Pribram, 1971b, 
p- 30]. 


We shall have more to say about meaning in Section 8. But to repeat the point 
here, the motor mechanism of the brain cannot be narrowly identified with the 
control of muscle tissue. Although the distinction between sensory systems and 
muscle effector systems can be made on well-motivated taxonomic grounds, this 
distinction cannot be made at the functional, neurophysiological level. Instead, 
the evidence just sketched indicates that both classic sensory systems and muscle 
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effector systems are intrinsically motoric in nature. If this is so, there is simply 
no alternative presently available to the generic motor theory of mind. Every- 
thing man can perform, perceive, and conceive results from his CNS being a 
generator, a creator, of both its sensory input and all its behavioral and 
conceptual output. The motor theory conceptual framework thus postulates 
that sensation, feeling and emotion, perception, conception, and action are 
motor functions, and that everything that is properly called mind falls into one 
or more of these categories. 


3 COGNITION AS SKILLED ACTION 


Granting the generic framework, the problem of specifying motor theories of 
particular cognitive processes remains. A review of theoretical formulations of 
specific cognitive domains will indicate how the metatheory relates to specific 
psychological and philosophical theories. This section looks in cursory fashion at 
topics recurring throughout this book, that is, at thought, concept formation, 
memory, and knowledge itself (both tacit and explicit) to see what motor 
theories of these domains propose; in Section 4 the problems of perception will 
be considered; the sensory order occupies Section 5; Section 6 examines the 
nature of consciousness, and Section 7 the problem of motivation; finally 
Section 8 focuses on the problem of meaning. 


3.1 Thought as Skilled ‘Behavior’ 


On one point, cognitive and behavioristic formulations agree: thinking is to be 
understood as behavior. Although classic behaviorist formulations attempted to 
reduce thought to ‘overt’ behavior by modeling it upon the peripheral muscula- 
ture (Note 6), it was always deemed the epitome of skilled performance. 
Cognitive analyses are in complete agreement on this latter point: 


When we say, . . . that thinking is an advanced form of skilled behavior, what we mean is 
that it has grown out of earlier established forms of flexible adaptation to the environ- 
ment and that the characteristics which it possesses and the conditions to which it 
submits can best be studied as they are related to those of its own earlier forms 


[Bartlett, 1958, p. 199]. 


Bartlett’s book was a beginning attempt to study thinking, particularly as it is 
exemplified in problem solving, from this point of view. But man the thinking, 
cultural creature that Cassirer aptly called Homo symbolicus is not merely a 
smart animal. Rather he is qualitatively different in that the fantastic power of 
thought (to be creative or productive and to transcend time) has enabled man to 
achieve emancipation from the various bodily skills. As Bartlett emphasized, 
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thinking is an adaptive skill, but it is qualitatively advanced beyond bodily skills: 


The skills of the body are attached to the demands of the world of the moment, but 
thinking can meet those far away in any direction and even those which have no time or 
place. And because, now, body movement is supplemented by signs and symbols—all the 
varied media of its expression which thinking can employ—the thinker is in a position to 
attain a vast increase in the range and delicacy of his adaptability [Bartlett, 1958, p. 
200). 


But little if any effort has been devoted to following up Bartlett’s lead: it is time 
for psychology to transcend the behaviorist attempt to reduce thinking to bodily 
skill, to see just how it is similar and how it is different from other skills, and in 
which directions the uniqueness of man lies (Note 7). 


3.2 Concept Formation and Creativity 


Problems of learning, specifically the issues of what is learned and under what 
conditions, have dominated theoretical psychology since the days of Watsonian 
behaviorism. Traditionally the answers proffered have been particular behavior 
patterns (responses or concept identifications, and so on) as a result of rein- 
forced experience with particular stimulus instances. Thus it was confidently 
asserted that the child learned what triangularity means by playing with dif- 
ferent triangles (and less confidently but equally explicitly assumed that the 
adult did likewise with notions such as freedom and dignity), and that what was 
involved in all such instances were the same principles already well known from 
basic animal learning studies. Learning was a matter of identifying (really, 
subtracting out) the relevant dimensions of stimulus configurations and then 
attaching a response to them. 

But the resurgence of cognitive psychology uncovered a fact, known to all but 
psychologists since the time of Plato, that destroyed this account by showing 
that the mechanism involved could not be either subtractive or an association (in 
any technically precise sense) of particular stimuli and responses; that is, it was 
rediscovered that all concept formation is creative or productive. The number of 
experienced particulars required to learn even the simplest concept (e.g., that A 
instantiates triangularity) is indefinitely large. To know a concept is to be able to 
apply it appropriately to a totally novel instance, one that need not ever have 
occurred before in the organism’s history, or indeed the history of the world 
(Weimer, 1973). The problem of productivity, which destroyed the structuralist 
approach in linguistics, is equally devastating to any associationist or behaviorist 
psychology which does not make a mockery of the meaning of those terms. 

But how can organisms come to recognize, on the basis of no prior exposure, 
all potential instances as instances of the same concept? The theories currently 
being developed in psychology are modeled upon the structure of theories in 
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linguistics that account for the productivity of sentences. Such grammars make 
indispensible use of fundamentally abstract entities (which are nowhere found in 
the experienced or surface structures of sentences) in their rules of determina- 
tion of the construction of sentences, and allow indefinite recursion in their 
derivation of surface instances from underlying or “deep” conceptual structures. 
All generic concepts appear to be the same as an underlying deep structure in 
that they can be characterized only by the abstract rules of determination which 
are their grammar. 

What is learned in concept formation is thus never purely perceptual nor 
particular; it is instead the rules of determination that constitute the invariant 
relations in a group (in the mathematical sense). Our “perceptual” knowledge is 
based upon “rules of seeing” that determine the invariant aspects of our 
environment and the group theoretic transformations applicable to them. But no 
sensory experience alone is identical with any concept: what the knowledge or 
concept is consists of both the underlying rules of determination and the surface 
representation that is experienced. 

Thus concept formation, since it is creative, must be abstract, generative, and 
nonperceptual. In these attributes concept formation exemplifies motor theo- 
retic principles in direct opposition to sensory formulations, which have postu- 
lated that concept formation is concrete, associative, and perceptual. The 
realization that concepts embody motor theoretic principles of determination 
has more often occurred to philosophers than psychologists. Psychologists 
should study the writings of Cassirer (1953) to see that a thorough treatment of 
the abstract nature of conceptual determination was available in 1910, Hanson 
(1970) to see a beautiful portrayal of the nonperceptual nature of concept 
formation in science, and Kuhn (1974) to see the generative nature of learning 
from exemplars. We have dealt with these aspects of conceptual abstraction 
more fully in the philosophical context of the problems of induction (Weimer, 
1975), and that discussion should amplify this brief section. 


3.3 Memory and Comprehension within Constructive Theory 


Similar conclusions apply to memory as to concept formation. The theory of 
memory developed by Bransford and his associates (e.g., Bransford & Franks, 
1971; Bransford & Johnson, 1972; Bransford & McCarrell, 1974), emphasizing 
the mind’s ability to utilize the information with which it is presented by 
generating contextual understanding rather than storing particulars, is a para- 
digmatic motor theory. Within this framework “memory” is a matter of the 
active, ongoing modulation of information rather than the retrieval of stored 
items or particular “bits.” This approach claims that human understanding is a 
matter of the mind’s ability to generate structural patterns of information which 
are meaningful and meaning-filled conceptual gestalts to which new information 
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can be assimilated. Indeed, the very act of assimilation, what we often simply 
call “thinking,” is a motoric skill, and that is understanding or comprehending: 


Our approach to comprehension focuses on the comprehender’s ability to use his general 
knowledge to create situations that permit the relations specified in input sentences to 
be realized, or to postulate situations (e.g., instigating forces) that allow perceptual 
events to be understood. In short, the ability to create some level of semantic content 
sufficient to achieve a click of comprehension depends upon the comprehender’s ability 
to think (Bransford & McCarrell, 1974, p. 220). 


3.4 Knowledge and Genetic Epistemology 


These remarks on cognition make it clear that the processes underlying human 
knowledge are (pace Bartlett) skilled actions. But what of knowledge per se? 
Piaget’s approach to the acquisition of knowledge, what he calls genetic epis- 
temology, is one example of a motor theoretic approach to knowledge (Note 8). 
Piaget holds that the child comes to know the world and remember its objects in 
terms of actions upon that world, rather than in terms of passively noticing 
relations among objects per se. Indeed, the relations between objects (which is 
all that the sensory metatheory acknowledges as knowledge) can only become 
objects-of-knowledge through active participation and dynamic interchange with 
those objects. The only relations among objects that can be known are those 
resulting from the way in which we behave with respect to those objects. And all 
behavior, as Piaget emphasizes, is a matter of (biological) adaptation, comprised 
equally and indispensibly of assimilation and accommodation. Knowing (within 
this framework) is adapting, and adapting is actively behaving: to know is to do. 


3.5 Tacit Knowledge and the Unconscious Mind 


Although conceptions of the unconscious have a long history in psychological 
speculation (e.g., Leibniz, Hartley, and Helmholtz all made the notion indis- 
pensible to their systems well prior to Freud), such notions are really highly 
problematic within the sensory metatheory. That the ‘unconscious mind’ must 
be literally a contradiction in terms for sensory theorists becomes obvious when 
the identification of mind with the alleged sensory component is acknowledged. 
In that framework unconscious mental processes are unsensed sensory processes, 
and they can occur, by definition, only below the cortical “level” of control. 
Thus, classic conceptions have identified the unconscious with the subconscious, 
which in turn was identified with the subcortical. The popular connotations 
of the Freudian id demonstrate this progression nicely, with the additional 
twist that rationality was identified with the conscious-cortical level and irra- 
tionality with all that goes on “below.” But the point remains: “sensory” 
conceptions of the unconscious sharply divide “higher” cortical processes 
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from “lower” processes and relegate all unconscious phenomena to the lower 
processes, and they do so in order to avoid glaring contradiction. 

But as is well known today, tacit or unconscious processes are inextricably 
involved in all our affairs, whether sensory or motor, conscious or otherwise 
(Polanyi, 1966; Turvey, 1974; Weimer, 1973, 1974, 1976). For an example in a 
traditional sensory domain, consider the problem of perceptual thing-kind 
identification: How does one know a face, say, to be an instance of a face? In 
traditional motor terms, the process of writing a sentence is a mundane, 
although far from insignificant, example. No explicit knowledge or ability is 
sufficient to account for even these ‘trivial’ instances of tacit awareness and skill, 
and all the fundamental problems of psychology (knowledge, action, percep- 
tion, conception, conation, and all the rest) are manifestations of such tacit 
processes. Perhaps the most that one can presently say is that the problems 
of tacit knowledge (thus construed) are the problems of meaning (see Sec- 
tion 8). 

We have no theories, even on the farthest horizon, that account for the 
problems of the unconscious represented in tacit awareness and skill. But we 
need not be saddled with a conceptual framework that renders no solutions 
possible. From the motor theory framework one can at least attempt to assay 
what is involved without running into contradiction and inherent confusion. 
Indeed, from this perspective the long delay in recognizing that unconscious or 
tacit processing precedes all conscious or explicit behavior is solely a product of 
the sensory metatheory: the shock wave caused by Polanyi (e.g., 1966) in 
philosophy and, for example, by Turvey (1974) in psychology results largely 
from a realization of the existence of pervasive problems that were literally not 
acknowledged to exist by sensory theorists. The problem for the future psy- 
chology of knowledge and action appears somewhat paradoxical: we require a 
theory of tacit knowing and doing that will also account for the “little bit” of 
conscious or explicit phenomena that we sometimes exhibit. 

Is the appearance of paradox genuine or spurious? Granting the motor meta- 
theory and the physiological evidence sketched in Section 2, it is not the least 
surprising that no sharp break can be maintained between either conscious-un- 
conscious or cortical-subcortical. The CNS is an integrated entity, a coalition in 
Shaw’s terms (Shaw & McIntyre, 1974), and from the functional rather than 
anatomical perspective it is an integrated unit. Even the extreme localization of 
consciousness within the language (dominant) hemisphere discussed in Section 
6.2 requires that the vast majority of neural events be tacit. And if consciousness 
is dependent upon involvement within a train of neural impulses in one hemi- 
sphere, then it is obvious that all other neural activity, including the vast 
est? mental processes, will remain tacit. 


majority involved in even the ‘high : 
Therefore, our paradoxical statement, although a severe shock to the conscious 


ego, is actually to be expected. 
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4 PERCEPTION AS A MOTOR SYSTEM 


We must turn now from overviews to more substantive theoretical formulations 
in order to give both evidential flesh and specific theoretical bones to motor 
theories. Consider first the domain of perception. The conscious experience of 
perception does not at first seem to lend itself to interpretation within the 
motor theory. Once again we wish to argue not only that it is susceptible to such 
interpretation, but that the motor theory provides the only framework capable 
of accounting for perception. The senses, considered as perceptual systems, are 
motor systems through and through. 

This section considers the conscious experience of perception; the problem of 
modality, which concerns why we have experience at all, is treated in Section 5. 
What we require for perceptual experience is a theory of the representation of 
information within the given modal systems; that is, the major problem for a 
theory of perceptual experience is to account for how information within a 
modal system is utilized by the organism to create the experience that it has. 
Since we can only infer but not directly assess experience in other animals, the 
data for this section must come from human perceptual experience. Specifically, 
we shall examine the perception of speech and visual form to see how motor 
theories developed for these areas are representative of accounts that can handle 
all perception. In a nutshell, the motor theory of perception holds that experi- 
ence within a modality results from the involvement of the same structural and 
functional neurophysiological apparatus as is involved in producing (as a sender 
of information rather than a receiver) that information. The account has face 
validity for the ‘near’ senses such as touch, and the perception of speech; once it 
has been developed for the latter domain we must see if it can be stretched to 
encompass the distance senses of vision and audition. 


4.1 A Motor Theory of Phonological Experience 


The phonological-articulatory mode of constructing experience has been studied 
intensively for the last two decades. Largely as a result of the Haskins Labora- 
tories researches, it has become clear that this mode of experience is fundamen- 
tally different from audition, and deserves to be recognized as a separate speech 
mode (Liberman, 1959; Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 
1967; Liberman, Mattingly, & Turvey, 1972). The motor theory of speech 
begins by assuming (on the basis of considerable research) that: “Underlying the 
experience of language at the phonological level, in every form of language use, 
are processes which are basically identical to the motor coordination processes 
involved in speaking, with the motor commands being in some way inhibited in 
all cases but that of overt speech [Halwes & Wire, 1974, p. 386; italics theirs] .” 
What the theory proposes is that our understanding of speech as such requires 
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the hearer to utilize the same central neural machinery that would be involved in 
speaking: “The assumption is that at some level or levels of the production 
process there exist neural signals standing in one-to-one correspondence with the 
various segments of the language—phoneme, word, phrase, etc. Perception con- 
sists in somehow running the process backward ... [Liberman er al., 1967, p. 
454].” Thus in both perception and production we construct or generate the 
conscious experience of speech. 

It is worth noting that the motor theory of speech perception has nothing to 
do with so-called analysis by synthesis. Such routines propose to understand the 
operation of a “black box” (such as a speaker-hearer) by constructing a model 
that “outputs” what a priori analysis of the black box specifies as its output; 
that is, analysis by synthesis performs an analysis and then synthesizes the 
output determined by the analysis. The motor theory does not say that speech 
perception is a matter of analysis by synthesis; to propose so would be to assert 
that the perception of speech involved a totally redundant operation (of synthe- 
sis). Rather, the motor theory proposes that the process of analysis (which is 
speech perception) is actually synthesis; that is, a constructive motoric process. 
The evidence in favor of this conception, which is extensive, is indicated in 
Halwes and Wire (1974). 

This general account can be directly extended to explain how we perceive any 
skilled action that we ourselves can perform. Therefore, touch, taste, the 
experienced aspects of locomotion through the environment, and so on, pre- 
sumably involve the motor command systems that are utilized in producing the 
movements of touching, tasting, and locomoting; that is, these perceptual 
systems are in fact motor skills production systems: this is the motor theory in 
its essential form and in its strongest claim. In these cases it is easy to construe 
the systems involved as intrinsically production systems; but the case of vision, 
for example, requires more comment (see Section 4.3). But in all cases, to say 
that the classic ‘sensory’ systems are actually ‘motor’ systems is to assert that 
sensing is an activity involving neural structures traditionally assumed to be 
inherently motoric in function. Therefore the representation of information in 
the sensory systems, which is perception, is a productive motor function. 


4.2 Pattern Recognition and the Höffding Step 


The heart of the problem of perception is easily stated: granted the sophisticated 
analysis of input information that occurs when nonmetal objects (either “ex 
ternal” objects or our own physical bodies) produce patterns of firing in 
‘sensory’ channels, how do we get from such coding of input information to 
perceptual understanding? How do we get from patterns of neural impulses in 
the sensory orders to recognition of objects? This, of course, is the problem of 
thing-kind identification, which concerns the recognition of an X as such on the 
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basis of the neural impulses causally resulting from X. Granting the causal theory 
of perception, how do we get from mediate causal linkages (sensory impulses in 
the CNS) to the ultimate result of the causal chain, perceptual understanding? 

Harald Hoffding (1891) called attention to this problem in arguing against 
associationist chaining theories. His analysis shows clearly that the linkage 
between two “associated” elements, say physical object A and perceptual 
understanding (naming) B, is never direct or linear, as associationism proposed. 
Rather, the linkage is always of the form shown in Fig. 2. The letter “a” 
represents analysis of the input in terms of thing-kind categories, and “b” the 
corresponding “proper part” neural component of understanding that must 
occur before the response “B” (either overtly or covertly naming the object) can 
occur. The problem of perceptual understanding is the problem of the Höffding 
step: How do we get from “a” to “B”? How do we achieve meaning in 
perception? 

No motor theory yet discussed (or for that matter, any psychological theory 
ever imagined) makes explicit what is involved in the Höffding step; that is, how 
the motor involvement in perception is to accomplish the jump from sensation 
to perception, from analyzed information coded in neural impulses to meaning 
and recognition, has not been specified. 

But if the perceptual systems can plausibly be interpreted as motor production 
systems, it is possible that this problem does not arise. Consider the case of 
speech perception if the account sketched in Section 4.1 is correct: 


Assume that the auditory system extracts certain structures from the input acoustic 
signal, which are just those structures it has been evolutionarily “designed” to extract in 
order to permit the recognition of speech. Now, on Pribram’s analysis of the nature of 
the motor cortex, it is at least possible that the Höffding step problem simply disap- 
pears. Presumably those invariant structures which are linguistically relevant on the 
input side are the same structures as those which are linguistically relevant on the output 
side, in the coordination of the articulatory muscles, for there is a strict isomorphism 
between the linguistically relevant aspects of the acoustic signal and the linguistically 
relevant aspects of the articulatory gestures, the former being produced directly by the 
latter. Moreover, the neural machinery for processing the acoustic signal evolved to- 
gether with the neural machinery for controlling articulation. So the recoding problem, 
the problem of getting from acoustic signal to articulation, dissolves and we are left with 
the relatively simple task of moving the information specifying the input structures into 
the motor cortex [Halwes & Wire, 1974, pp- 387-388]. 
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FIG.2 The Höffding step in the process of pattern recognition or perception. 
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This line of reasoning solves the problem of recognition in perception by 
assimilating it to the problem of meaning in production. Thus the traditional 
coding problem, the problem of the Höffding step, vanishes as a pseudoproblem. 
But the problem of meaning or understanding does not vanish in this manner: it 
is only relocated in (if this approach is correct) its actual position. This approach 
tells us how meaning is manifested (in terms of motor involvement), but it does 
not tell us what meaning is. 

This discussion has one benefit in any case: it forces a clear distinction 
between two different kinds of questions that may be asked in perception. One 
question, the only one pursued by the majority of cognitive psychologists, has to 
do with the “How” of perception, and concentrates upon the constructive 
processes which achieve our perceptual experience. But the question of how we 
perceive can only be addressed by presupposing knowledge of a second question, 
concerning the “What” of perception. Gibson (1950, 1966) and his associates 
have devoted much attention to specifying what the information is that is 
perceived, and have elaborated a position that does not emphasize constructive 
processes, at least in visual perception. Gibson’s approach, Ecological optics, 
argues that all the information necessary to account for perception is present in 
the ambient optic array, and that when the “What” is correctly specified, there 
is no need for any coding such as the Hoffding step. There is no need, for 
instance, to involve memory or reconstructive processes in perception if the 
information specifying perceptual experience is present in the stimulus. The 
problem instead is to explain how the senses, considered as perceptual systems, 
modulate information. But the modulation of information that Gibson envisages 
is a transduction of incoming energy rather than a translation (or encoding), and 
this has led some theorists to propose that constructive theory, which apparently 
does deal with coding, be limited to linguistic systems (Note 9). 

But as the Halwes and Wire reasoning makes clear, there is no need for the 
Höffding step in the constructive domain of language either. The opposition 
between the Gibsonian and constructive positions is, we believe, based on 
misconceptions on both sides. The constructivist must realize that no construc- 
tion can occur in any mental realm unless there is information available upon 
which constructive processes operate. The slogan “Everything is a construction’ 
is a contradiction in terms: unless there is an informational basis available, no 
construction can occur. Similarly, the Gibsonian must realize that pointing out 
that the information necessary to account for perception is specified in the optic 
array (or the speech signal, etc.) says nothing about how we perceive. An 
adequate theory of perception must include not only a specification of what is 
perceived but also an account of how it occurs. Further, if it is to be a theory of 
organismic perception, it must be consonant with a specification of “Who 
perceives, that is, a theory of the ‘algoristic basis’ (as Shaw calls it; see Shaw & 
McIntyre, 1974). 
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Rather than pursue a conflict of misunderstandings, let us examine visual 
perception from the point of view of a motor theory. We can agree with both 
Gibson that the “what” of perception is specified by information in the stimulus 
array, and with the constructivist that the problem is to specify how the mind 
turns that information into perceptual experience. 


4.3 Vision as a Motor System 


Vision is prima facie problematic for a motor theory because it is a distance 
sense in which no obvious motor productions are involved. Unlike speech, we do 
not ‘produce’ vision the way we produce, as well as perceive, speech, body 
postures, and so on (an exception is imagery: see Section 4.4). Therefore, the 
conscious experience of visual phenomena requires our attention. Both Sperry 
(1952) and Festinger (Festinger, Burnham, Ono, & Bamber, 1967) have pro- 
posed motor theories of vision; we shall overview the latter account, which 
places efference, or motor commands, at the center of visual form perception. 

If discussion were confined to perception of motor movements, there would 
be no problem in specifying a motor theory of perception. But efferent instruc- 
tions issued by the CNS to the eye seem to play no role in visual perception. For 
instance, we do not have to “follow” a contour with eye movements to perceive 
the contour. Therefore we must retreat one step and speak of “readiness” to 
issue motor commands to the eye as being involved in conscious perceptual 
experience. Sperry suggests the following: 

If there be any objectively demonstrable fact about perception .. . it is the following: 

Insofar as an organism perceives a given object, it is prepared to respond with reference 

to it. This preparation-to-respond is absent in an organism that has failed to perceive. . . . 

The presence or absence of adaptive reaction potentialities of this sort, ready to 


discharge into motor patterns, makes the difference between perceiving and not per- 
ceiving [Sperry, 1952, p. 301]. 


But how, experimentally, could one assay anything as tenuous as a “readiness 
to respond”? Festinger provided the answer: “If, without changing anything 
about the pattern of retinal stimulation, one could alter the particular prepro- 
grammed sets of efferent instructions that were activated and held in readiness 
for immediate use, one would expect to produce a change in visual perception 
[Festinger et al, 1967, p. 14].” These investigators performed a series of 
experiments with prism spectacles and contact lenses to test an “efferent 
readiness” theory of visual form (contour) perception which they state as “this 
theory, . . . holds that visual perception of contour is determined by the particu- 
lar sets of preprogrammed efferent instructions that are activated by the visual 
input into a state of readiness for immediate use [Festinger et al., 1967, p. 34].” 

It is crucial to note that this is a theory of the conscious experience of 
perception. Visual input per se is acknowledged, but not given the primary role 
in determining experience. Rather, the theory states that experience results from 
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the motor commands activated into a state of readiness by the CNS upon receipt 
of the incoming visual information. It is thus a constructive theory of visual 
experience, since the contour is not “seen” in terms of the input at all, but 
rather in terms of the efferent commands issued in response to that input: 
perceptual experience could not occur without input to the visual system (and as 
Gibson states, that information is sufficient for perception), but that input is not 
the determinant of the ensuing perceptual experience. 

The data support Festinger’s efferent readiness theory in the four experimental 
paradigms he examined. Perhaps the most interesting case involved mounting a 
distorting prism directly on the eye via a contact lens. In this case, the subjects’ 
objective eye movements must conform to the contour to maintain fixation. In 
this instance: 

The old well-learned efference for an eye movement to fixate that is activated by visual 

input will result in a loss of fixation. To move the eye and maintain fixation along the 

contour, S wearing the contact lens must learn a new set of efferent instructions to issue 
in response to the visual input. If the conscious experience of visual perception of 
contour is, indeed, determined by efferent readiness activated by the visual input, then 
to the extent that S learns a new afferent-efferent association and, hence, a different 
efferent readiness is activated by the visual input, he will have a different visual 


perception of the contour. ; 
In accordance with these theoretical expectations all three Ss showed appreciable 


change in the visual perception of curvature as a consequence of simply scanning the line 
while wearing the contact lens. This occurred whether S viewed an apparently straight 
line or an apparently curved line. Further evidence suggests that there is appreciable, 
perhaps complete, interocular transfer of this change in perception of contour. The data 
also provide a hint that, if the eye movement involved is a smooth tracking movement, 
there is less change in visual perception than if the eye movements are saccadic 


[Festinger et al., 1967, p. 35]. 


Subjects in this condition did experience changed perception of the contour, 
regardless of whether they were viewing an apparently straight or an apparently 
curved line. 

Thus, there is evidence that even such a prima facie unlikely instance as visual 
form perception is susceptible to interpretation by a motor theory. (Presumably 
a similar account can be detailed to handle the phenomena of auditory percep- 
tion, although none has yet been worked out.) Indeed, since no other (sensory) 
theory yet proposed can address data such as Festinger’s, it appears that the 
motor theory approach is virtually without rivals in the domain of perception, at 
least with regard to the “how” questions involved. 


4.4 Imagery and Hallucination as Motor Constructions 


Further evidence for the motoric nature of perception comes from the construc- 
tive processes underlying imagery and hallucination (rather, processes which are 
images and hallucinations). Despite Gibson’s correct insistence that the informa- 
tion specifying perception is available in the stimulus array, this specification of 
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what is involved does not address the processes by which perception occurs. The 
“what” question is being dealt with beautifully by Gibson and his associates, but 
the “how” question remains. Constructive theory in cognitive psychology de- 
veloped as an attempt to deal with this latter question. Let us examine the 
constructive account of imagery, to see the motoric nature of such processes, 
and to see that constructive theory does not make perception an illusory 
process. 
The motor theory of perception has often been criticized in this manner: 
Such views maintain that perception per se is more a reflection of the response patterns 
instigated in the brain by an input than it is a resultant of the input patterns. This 
appears an extreme view—taken to its logical and absurd conclusion it would mean that 
we would perceive every woman on the street as Aphrodite (and in whatever state of 
dress or undress we are set to see her at the moment) and every man as Adonis. There is, 
of course, some truth to this—“beauty” is to a large extent in the eye of the beholder. 
Yet the distortions of perception that are ordinarily possible are limited. When they go 
beyond these limits we speak of them as illusions and hallucinations. And though any 
account of the perceptual process must take into consideration the possibility of the 
production of illusions and hallucinations, the account need not make illusions and 
hallucination the perceptual way of life [Pribram, 1971a, pp. 91-92]. 


Such criticism is fundamentally misguided, because the motor theory asserts that 
the processes underlying perception are identical to those underlying hallucina- 
tion and imagination, but it does not assert that perception is always illusory. 
The problem of the veridicality of perception, its degree of adequacy of 
representation of, or correspondence to, the nonmental realm is orthogonal to 
the problem of explaining the nature of perception. In this latter regard, 
perception and hallucination are on a par. We can equally speak of “hal- 
lucinating what is really there” and “perceiving what isn’t there.” What we 
require is a theory of the constructive processes underlying perception that 
accounts for the fact that perception and hallucination are different areas of a 
continuum, rather than antagonistic or unrelated processes. (The problem of 
truth, and the adequacy of representation provided by perceptual construction, 
is a ‘philosophical’ issue quite different from the nature of such construction.) 
Motor theories have predominated in the field of imagery since the beginning 
of academic psychology. Alexander Bain (1855) proposed an explicit motor 
theory of imagery, as well as a motor theory of thought. William James did 
likewise in the Principles in 1890. Wundt, the founder of psychology, abolished 
the distinction between sensation and image: “There is absolutely no reason why 
a sensation—blue, green, yellow, or what not—should be one thing when it is 
accompanied simply by an excitation in the ‘visual center’ of the cortex, and 
another and quite different thing when this excitation is itself set up by an 
operation of some external stimulus [Wundt, 1912, p. 14].” Two lines of 
research have shown Wundt’s reasoning to be correct. The first, and most 
familiar, shows ‘perception’ in the absence of external stimulus input, that is, 
shows that we construct images we are unaware to be images. The tachistoscope, 
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with a blank stimulus card and instructions to “tell me what you see” is the 
usual vehicle for such studies (Eriksen, 1958; Goldiamond & Hawkins, 1958; 
Neisser, 1967), although Warren (1970) shows the same effect in speech percep- 
tion when a subject ‘hears, for example, “to” in “go to the store,” when 
nothing but white noise is actually present. The second line of research stems 
from Perky (1910), a student of Titchner’s, and shows that we can equally easily 
mistake perception for images. The “Perky effect” is the confusion of percep- 
tion with imagination. If a subject is told to fixate a spot and imagine, say, a 
banana, he will believe that a picture of a banana projected on the spot is his 
image, despite its orientation, color, and so on being different from his initial 
“image.” The data indicate therefore that there is no difference in kind between 
the experience of perception and imagination (Segal, 1971). Experientially it 
makes no difference whether we perceive or hallucinate. This is all but unintelligi- 
ble unless both processes are constructive, motoric actions in which experience is 
determined by central neural commands or expectations rather than the actual 
sensory input. 

There is also evidence to indicate that the same ‘motor’ is involved in both 
perception and hallucination. Studies of eye movements during dreaming, 
auditory imagery during reading, emg recording during imagined action, and so 
on, all show action patterns virtually indistinguishable from their overt cor- 
relates. Brooks (1968) shows that engaging imagery in a modality delays overt 
responding in that mode, but not in another, suggesting that imagery and 
responding occupy common neural channels. Segal and Fusella (1970) show that 
visual imagery decreases the ability to detect visual signals, suggesting that 
imagery and perception also occupy common channels. 

All this research, in combination with enormous amounts of data indicating 
the constructive nature of perception (Gregory, 1966; Lindsay & Norman, 1972, 
Neisser, 1967), can easily be made sense of by construing perception, imagina- 
tion, and hallucination as constructive processes within the motor theory frame- 
work. No other theory has yet successfully reconciled the indistinguishability of 
these processes, or been able to address data such as that of Festinger and his 
co-workers, or of Penfield (Penfield & Roberts, 1959) which shows imagery to 
result from direct stimulation of certain areas of the cortex. 


5 THE SENSORY ORDER AND THE PROBLEM OF MODALITY 


Psychologists rarely deal with what must be the most obvious and fundamental 
at our phenomenal experience has the 


problem posed by perception: the fact th 
qualitative dimensions that it does. Hayek (1952) stated the problem that we 
face with unparalleled clarity: 


What psychology has to explain is not something known solely through that special 
g which we experience whenever we 


technique known as “introspection,” but somethin; 
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learn anything about the external world and through which indeed we know about the 
external world; and which yet has no place in our scientific picture of the external 
world and is in no way explained by the sciences dealing with the external world: 
Qualities. Whenever we study qualitative differences between experiences, we are 
studying mental and not physical events, and much that we believe to know about the 
external world is, in fact, knowledge about ourselves. 

It is thus the existence of an order of sensory qualities and not a reproduction of 
qualities existing outside the perceiving mind which is the basic problem raised by all 
mental events. Psychology must concern itself, in other words, with those aspects of 
what we naively regard as the external world which find no place in the account of that 
world which the physical sciences give us (Hayek, 1952, pp. 6-7].* 


This is the problem of modality: Why should sights be sights, sounds be sounds, 
speech be speech, and so on? Within a given sensory order the same issue arises: 
Why should red, say, be what we occurrently know as red? The problem, in 
short, is the philosophical one of “raw feels”: to account for the qualitative 
dimensions of perception that we know by acquaintance in human experience. 
How and why are the sensory orders what they in fact are? 

Why the sensory orders have the particular qualities that they do is beyond the 
limits of human inquiry; but how the modalities are functionally differentiated 
can be sketched with a fair degree of confidence. When elaborated, the account 
reinforces the motor theory of mind by dealing with an area that is perhaps the 
least obvious domain to which it could apply. 

In order to tackle this problem at all it is necessary to point out once again 
that the ‘absolute’ nature of sensory qualities cannot be a problem for theo- 
retical inquiry. The problems we can formulate are immediately cast in the 
language of description rather than acquaintance. All that can be communicated 
are the differences between sensory qualities, and such differences do not enable 
us to assert that the qualities different people experience are similar in any 
“absolute” sense. All we can discuss is whether different sensory qualities differ 
in the same way for different individuals. The intrinsic nature of raw feels is 
simply not a proper subject matter for conceptual thought, and therefore is 
beyond the bounds of both science and common sense. As Hayek put it: 


There are no questions which we can intelligibly ask about sensory qualities which could 
not also conceivably become a problem to a person who has not himself experienced the 
particular qualities but knows them only from the descriptions given to him by 
others.... Nothing can become a problem about sensory qualities which cannot in 
principle also be described in words; and such description in words will always have to 
be description in terms of the relation of the quality in question to other sensory 
qualities [Hayek, 1952, p. 31] (Note 10). 


With this proviso restated, let us trace the historical development of thought 
on the sensory order. We may begin with the extremely influential position of 
Johannes Müller, presented in the 5th volume of his Handbook of Physiology in 


*From F. Hayek, The Sensory Order. Chicago: University of Chicago Press, 1952 Copy- 
tight © 1952 by the University of Chicago Press. 
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the 1830s. Müller propounded the doctrine of the specific energies of nerve 
substances: “Sensation consists in the sensorium’s receiving .. . a knowledge of 
certain qualities . . . of the nerves of sense themselves; and these qualities of the 
nerves of sense are in all different, the nerve of each having its own particular 
quality or energy [quoted in Boring, 1950, p. 84].” This doctrine, which merely 
codified prevalent thought among physiologists, attributes the qualities of the 
sensory modalities to the characteristics of the nerves that compose those 
modalities: “The nerve of each sense seems to be capable of one determinate 
kind of sensation only, and not of those proper to the other organs of sense; 
hence, one nerve of sense cannot take the place and perform the function of the 
nerve of another sense [quoted in Boring, 1950, p. 84].” 

Miiller’s pupil, Helmholtz, studying audition, extended the theory to account 
for differences within each modality by proposing (what Boring calls) the theory 
of specific fiber energies: each nerve fiber within a sensory order was assumed to 
have a characteristic quality. But in his place theory of hearing, Helmholtz added 
another idea current at the time: that quality depends equally upon the place in 
the nervous system. Helmholtz focused attention upon places in the receptor, 
for example, by looking for “resonators” on the basilar membrane. Other 
investigators looked for places in the termination of sensory fibers, that is, in the 
brain. In 1870 Fritsch and Hitzig, working within a long tradition of brain 
localization of function, discovered by electrical stimulation the motor area of 
the precentral gyrus. Subsequent investigators tended to attribute sensory quali- 
ties to the central termination of the fibers on the basis of extensive stimulation 
studies. 

By the turn of the century Miiller’s doctrine had been abandoned, and it was 
assumed that sensory qualities resulted from a combination of localizations— 
both the place in the receptor from which the impulse originated, and the 
cortical termination at which it arrived. Knowledge of the neural impulse, 
culminating in Bernstein’s theory that the impulse is a wave of electrical 
depolarization, precluded any specific qualities for individual nerves. But it did 
not take long for the combined localization views to be refuted. Lashley (1929) 
showed that the cortical terminations of sensory messages were not indispensible 
to discrimination, and proposed the thesis of equipotentiality in place of 
localization of function. Lashley’s mass action thesis admits localization, but 
holds that no particular localization is necessary for any particular cerebral 
functining. Since the proponents of localization had themselves shown that the 
receptors were not crucial (by producing sensory qualia by stimulating the 
cortex directly), there is not much left of the localization view, at least at the 
level of specific sensory qualities. 

But if sensory qualities are not due to the character of individual nerves and do 
not depend essentially upon either the receptors from which they originate or 
the central neural termini to which they transmit, to what can they be attrib- 
uted? By abandoning anatomical localization for functional localization in terms 
of relations among patterns of neural impulses, Hayek provided the only alterna- 
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tive that is compatible with known CNS functioning. The sensory order can only 
be understood as a system of connections of impulses within the active CNS. As 
Hayek noted, the nervous system is fundamentally an instrument of classifica- 
tion, and the sensory order receives the properties it possesses in virtue of the 
patterns of impulses that create it. To repeat Hayek’s point: 


The sensory order (or other mental) qualities are not in some manner originally attached 
to, or an original attribute of, the individual physiological impulses, but . . . the whole of 
these qualities is determined by the system of connexions by which the impulses can be 
transmitted. . . . It is thus the position of the individual impulse or group of impulses in 
the whole system of such connexions which gives it its distinctive quality [Hayek, 1952, 
p. 53]. 


To put the point another way, Hayek’s generic theory of mind is sufficiently 
rich that it encompasses the sensory order as a special case without in any way 
changing that generic theory. And as should be obvious, Hayek’s theory of mind 
is a motor theory: the mental realm exists only in virtue of the generation of 
impulses within the nervous system. This activity, which is inherently classifica- 
tory in nature, creates both the traditional ‘sensory’ and ‘motor’ aspects of 
behavior. Not only does the nervous system generate behavior, but it also 
generates the order of events within the organism which is mind. Since the mind 
is, in the last analysis, the pattern of events that make it up, it is obvious that all 
its attributes depend upon patterns. Pattern talk is equally as indispensible for 
the understanding of theories and conceptual frameworks (see Hanson, 1970) as 
it is for sensory experience, and for precisely the same reasons in both cases 
(Note 11). 


6 THE NATURE OF CONSCIOUSNESS 


Perception, as conscious experience, is one thing; but what of the nature of 
consciousness per se? What theoretical accounts of consciousness are available? 
So far as I am aware, there is only one hypothesis as to the nature of 
consciousness in the literature which deserves to be labeled ‘theoretical’: the 
proposal of Roger Sperry (1952, 1966, 1968, 1969, 1976). All other accounts 
are merely paraphrastic, defining consciousness in terms of attention and aware- 
ness, and self-consciousness in terms of self-awareness. 

In essence, Sperry proposes that consciousness is our knowledge by acquaint- 
ance of what we know by description as the motor production command system 
of cortical activity involved in thinking and attending. As such, consciousness is 
an emergent property of cerebral activity having to do with the functional 
organization of the active brain. Further, since we are aware of our own 
consciousness, it follows that the brain must be able to detect and to react to the 
pattern properties of its own activity. Self-consciousness simply is the functional 
brain’s detection of its patterns of activity. 
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Two points that are integral to this conception of consciousness are worthy of 
separate mention. The first concerns the role of consciousness in the causation 
of human affairs, the second locates consciousness within the brain. Both have 
profound effects upon classic positions on the mind—body problems. 


6.1 The Causal Efficacy of Consciousness 


Concomitant with this picture of consciousness is an admission of its causal 
efficacy. On this point, Sperry (1952, 1969, 1976), Eccles (1973), and Pribram 
(1971a) agree: consciousness plays a directive role in the flow of cerebral 
impulses, and thus is essential to voluntary action. Consciousness supervenes (as 
Sperry puts it) in the generation of behavior in virtue of its position at the top of 
the organizational hierarchy which is the human mind. Neural impulses “are 
simply carried along or shunted this way and that by the prevailing overall 
dynamics of the whole active process [Sperry, 1969, p. 534]” (Note 12). 

This puts experience, our knowledge by acquaintance, right back in the heart 
of the problems facing scientific psychology. As Sperry notes: 


Perceived colors and sounds, etc., exist within the brain not as epiphenomena, but as 
real properties of the brain process. When the brain adjusts to these perceived colors and 
sounds, the adjustment is made not merely to an array of neural excitations correlated 
with the colors and sounds but rather to the colors and sounds themselves [Sperry, 


1969, pp. 535-536]. 


Thus epistemically the only tenable mind-body position is an emergent inter- 
actionism (see Weimer, 1976). 


6.2 The Localization of Consciousness 


Although a slight digression from the motor theory of consciousness, it is worth 
remarking that Sperry’s work with commissurotomized patients has enabled an 
anatomic localization of consciousness within the brain undreamed of even a 
decade ago. The remarkable finding that results from callosal sectioning is that 
all the actions programmed from the minor hemisphere (98% of the time the 
right hemisphere) are not recognized by the conscious subject as being actions of 
his own initiation. Conscious awareness of the initiation of actions is restricted 
entirely to the dominant, speech-localized hemisphere. Actions initiated by the 
right or minor hemisphere are unconscious, or tacit. Voluntary awareness there- 
fore resides in the major hemisphere, presumably in the anatomically specialized 
areas for speech comprehension—largely Wernike’s area (Geschwind, 1972). Only 
the dominant hemisphere has access to language, and when the minor hemi- 
sphere is cut off from it, the conscious being, the self, can no longer verbalize 
the proceedings of the minor hemisphere. Sectioning of the callosum reveals 
what has always been the case with consciousness: that the minor hemisphere is 
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unconscious, and that linkage through the corpus callosum is necessary for it to 
either give or receive information from the conscious “self” (Note 13). 

Two points need to be mentioned in this regard. First, since the areas of the 
major hemisphere that are implicated as underlying consciousness are concerned 
with speech perception, the motor theories of these two domains become linked, 
The motor theory of consciousness now becomes prima facie as plausible as the 
motor theory of speech perception, and vice versa. Second, certain simplistic 
mind-body positions are empirically refuted (Eccles, 1973). Specifically, the 
classic neural identity theory as propounded by Feigl (1967) is refuted, for it is 
not the case that the conscious mind is coextensive with the entire brain, as that 
identity theory holds. Further, double knowledge theories that assume we know 
all the brain by description and experience by acquaintance are wrong, because 
we know by acquaintance only a tiny part of the activity of the brain. The 
problem for the mind-body relationship is to explain why and how only this 
particular part of the brain is involved in conscious experience and “selfhood.” 
Tacit knowledge is a severe problem for the various forms of the identity theory 
yet proposed (Weimer, 1976). 


7 THE PROBLEM OF MOTIVATION 


Classically, the problems of motivation center around the activation, energiza- 
tion, and direction of behavior. In sensory conceptions of mind, motivation is 
either extrinsic to the organism or at best concerned with its information-pro- 
cessing capabilities; that is, motivation is external to the CNS, arising from such 
classic sources as strong and painful external stimuli, internal bodily and homeo- 
Static states such as hunger and thirst, and sex (if sex is not considered in the 
previous category). Drive theory was the classic approach to motivation in 
stimulus-response psychology, and Dollard and Miller’s (1950) statement that 
drives are “strong stimuli which impel action [p. 30] received universal assent: 
the quarrels concerned whether such stimuli arose from physiological need 
states, whether their reduction was reinforcing, how they led to choice of goals, 
and affected learning, and so on. In such passive conceptions of mind, a 
motivational variable was required to arouse the organism from a quiescent state, 
to make it behave ab initio. Indeed, Woodworth (1918) introduced the concept 
of drive into psychology from the analogy of the drive train in an automobile, 
without which a car cannot move or “behave.” This conception of the nature 
and role of motivation was sufficiently strong to lead some theorists to account 
for evidence refuting it (such as observations of exploration, play, curiosity, 
etc., even in rats) by postulating a “boredom drive” whose reduction, and 
hence reinforcement, was attainment of a state of stimulation after a lack 
of input. Stimulus deprivation thus became the ultimate drive in the ab- 
sence of other identifiable sources, and the passive organism could be goaded 
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into action by either the presence or absence of stimulation (see, e.g., Fowler, 
1965). 

From a stimulus-response-reinforcement point of view, no other way of 
treating motivation seemed feasible. But if that framework, and the passive 
conception of the organism, are abandoned, the problem of motivation is 
entirely different. Indeed, part of the problem for cognitive theory is that the 
old problems of motivation just don’t arise when considering the higher mental 
processes. Proliferating new drives, such as a drive to communicate, or to reduce 
semantic “dials” in a “cognitive mixer” (Osgood, 1971) to “explain” why we 
speak, and so on, is simply absurd. To say that any classic source of motivation 
underlies thought, reasoning, imagery, creativity, perceiving, testing a scientific 
theory, and so on, no longer makes sense. But the problems of the domain of 
motivation remain: How can we understand what, for example, motivates our 
quest for knowledge? 

Without denying the utility of a restricted, nondispositional theory of extrinsic 
motivation, especially for intense ‘stimulus’ inputs, it seems clear that such 
accounts must be supplemented by a theory of the motivation intrinsic to the 
higher mental processes. What is required is a mechanism of motivation inherent 
in the active CNS, dealing with information processing and action. Psychology 
must unpack the promissory note found in “acknowledging an intrinsic motiva- 
tion or a motivation inherent in the organism’s informational interaction with 
circumstances through the distance receptors and in its intentional, goal-antici- 
pated actions [Hunt, 1965, p. 197].” We must acknowledge that motivation is 
intrinsic to the operation of the CNS, and that the extrinsic sources of motiva- 
tion that can be retained from classic motivation theory have their ‘source’ in 
the intrinsic motivation of nervous systems; that is, cognitive psychology must 
develop a “motor theory” of motivation. 

But no such theory exists, although there are faint hints at what is involved 
(Hunt, 1965; Pribram, 1971a). In the meantime the field of motivation is 
changing in two directions at once. On one hand, there is a proliferation of “new 
looks” toward old problems (growth motivation, personal causation à la 
DeCharms, 1968, etc.), attempting to replace the drive theory framework 
without entirely abandoning what the sensory conception construes as motiva- 
tion. On the other hand, cognitive psychology is dealing with motivational 
problems and issues (such as the issues stated in the title of this volume), but not 
developing a specific theory of motivation. This second approach uses motiva- 
tion without mentioning it, and treats the problems of motivation in the process 
of addressing other issues. Implicit in this approach is the idea that an adequate 
motivational theory will be a by-product of a completed cognitive psychology, 
but that motivation, as a separate domain, may not have any counterpart in the 
cognitive framework. If this is indeed the choice we face, between trying to 
translate the old motivational system into the cognitive paradigm and abandon- 
ing the field as a separate entity, it is clear which way to go. 
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8 THE MANIFESTATION AND REPRESENTATION OF MEANING 


The focal problem of philosophy and psychology is meaning: both disciplines 
have as an essential task the explanation of its nature and functioning. This 
section overviews the nature of major problems and their relation to the motor 
theory framework. Section 8.1 proposes an identity of the problems of meaning 
and tacit knowledge; Section 8.2 proposes that the motor involvement in the 
manifestations of meaning be modeled upon craftsmanship; Section 8.3 explores 
a conceptual framework for understanding the context relativity of comprehen- 
sion; and Section 8.4 considers holography as the neurological “motor” under- 
lying the representation of meaningful information. 


8.1 Meaning as the Problem Underlying Tacit Knowledge 


I have made this argument in a number of locations; a succinct presentation 
occurs in Weimer (1974), which I can do little better than repeat: 


I would like to argue that the entire problem of tacit knowledge is nothing more, 
nothing less, than the problem of meaning. In this sense, there is only one problem that 
has ever existed in psychology, and everything the field has investigated is merely a 
manifestation of that problem, a different aspect of the same elephant, an elephant that 
we have grasped at since the dawn of reflective thought without ever reaching at all. 


This problem has many names. In the language of behaviorism, it is a matter of 
stimulus generalization or of stimulus equivalence. In the terminology of Gestalt 
psychology, it is the problem of contact between perceptual processes and memory 
trace: the so-called “Höffding step.” Among philosophers, the question is usually 
formulated in terms of “universals” and of “abstraction from particulars.” For 
Bruner and his associates, it is the problem of categorization. In computer tech- 
nology, it is called “character recognition” when only letters and numbers are to be 
identified, or more generally “pattern recognition” [Neisser, 1967, p. 47]. 


The problem, “When are stimuli equivalent?”, is the problem of stimulus recognition, 
which is the problem of concept formation, ad infinitum, all of which together con- 
stitute the problems of meaning. Stimuli are equivalent, in the final analysis, only 
because they mean the same thing. No matter where one goes in psychology there comes 
a point at which one runs straight into an insurmountable wall that is, conceptually 
speaking, infinitely high and wide. All we can do is look up and see that written on that 
wall are all the problems of the manifestations of meaning [Weimer, 1974, pp. 427-429] . 


8.2 Craftsmanship and Skill in the Manifestation of Meaning 


The motor theory of mind yields a very informative analogy or model of the 
manifestation of meaning in all cognition and action: if meaning is action, then 
its exhibition is a matter of craftsmanship. The motor involvement (recall 
Section 2) in the creation of meaning is an instance of skill, and the most fruitful 
metaphor for the understanding of skill is still Plato’s conception of the soul as a 
craftsman, or artificer, who constructs the entire realm of human participation 
in the universe. The motor theory asserts that the manifestation of meaning is a 
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product of the skill of the CNS, and thus that the way to study meaning is to 
study the ways in which the CNS is skilled. Craftsmanship thus becomes the root 
metaphor of mind, instanced in every ‘act’ of either the reception of information 
or the production of behavior. Meaning becomes the underlying interrelationship 
of all ‘higher’ mental processes, manifested in each such process but not 
identifiable with any one of them. Thus meaning is always context dependent: 
dependent upon the framework that relates all ‘higher’ mental processes (rather 
than upon any one such process). 


8.3 The Contractual Nature of Understanding 


No long ago it was thought that an adequate psychology could be context 
free—that the rules of determination for the assimilation of information and the 
production of action could be written in a grammar that utilized no context- 
dependent constraints. The abandonment of this program in linguistics has now 
ramified throughout psychology and one of the most prevalent effects of the 
‘cognitive revolution’ has been the realization of the necessity for context- 
sensitive rules for both comprehension and production. The phenomenon of 
ambiguity, especially in its deep structural form, has been perhaps the most 
significant substantive importation into psychology from transformational lin- 
guistics. 

Comprehension, the ability to render incoming information meaningful, is 
basically the skill of resolving ambiguity. Most research in comprehension that 
has occasioned recent interest, such as that of Bransford and his collaborators, 
consists in little more than demonstrations of the now “obvious” fact that if a 
subject is given an appropriate context he can understand virtually any input 
whatsoever (or conversely, that seemingly easily understandable input can be 
rendered incomprehensible by supplying the “wrong” context). Thus “context” 
is now the key word in the experimental study of comprehension (e.g., see 
Jenkins’, 1974, overenthusiastic extension of Pepper’s, 1942, metaphor of 
contextualism). 

However, context per se may or may not b 
comprehension: there are higher order constraints that determine when a given 
context will or will not produce certain effects. Following Proffitt (1976) I call 
these higher order constraints contracts. The determining factor in comprehen- 
sion that specifies when contextual sensitivity will or will not be evidenced then 
becomes contractual obligation. Contractual obligation is a matter of the sub- 
ject’s coming to treat incoming information as bound up within the terms of a 
contract, a contract that specifies what contexts are appropriate to use for the 
assimilation of subsequent information. 

The theatrical metaphor is a good vehicle for understanding the nature of a 
contract. How one interprets, say, a play depends upon the manner in which one 
comes to treat the happenings upon a stage. We are perfectly willing to admit 


e the determining factor in 
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that, say, a poorly hung curtain is “really” a wall between two rooms in the 
context of a play, but moments later that same curtain is just a stage prop. Or 
consider the performance of a good magician (one whose “magic” is concerned 
with the wonder of the world, rather than a clever, but always ego-involved, 
perpetrator of mere tricks). His performance induces awe, amazement, profound 
respect, and in a very important sense, belief; but not so a carnival “trickster.” 
Are such effects just due.to context? Certainly not. The context remains the 
same (or can be identical), but the contract one employs to interpret the context 
changes. One participates in the play or magic show in a very different way from 
what one does in just being in a theater or sitting in an audience. It is as if there 
is a prior agreement between the playwright (or magician) and ourselves that we 
will accept subsequent information only within certain contexts and not in 
others. It is as if we are contractually obligated to utilize certain contexts and to 
ignore others that are, a priori, equally capable of being imposed upon the events 
in question. We present incoming information to ourselves through contracts 
that define the way events ought to be, and hence establish given contexts at the 
expense of others. 

The concept of contractual obligation allows a novel interpretation of the 
phenomenon of learning: instead of being a “meaningless” matter of reinforced 
practice, as standard textbooks assert, it is a matter of contractual change. What 
is learned, that is, that which allows the organism to respond appropriately in 
novel contexts, is a changed contract. Learning is thus a restructuring of the 
meanings of events through contractual change. Therefore when contracts and 
contexts compete, it is the contract that determines what one will see (Proffitt, 
1976). We learn to render the world meaningful in terms of contracts which 
specify permissible contexts, not in terms of contexts alone. 


8.4 The Representation of Meaning 


The old problem of storage, of how the nervous system makes available represen- 
tations of input, has largely given way to a new problem which concerns what it 
takes to recreate a perspective. The problem of memory (in all cases except sheer 
rote learning) no longer revolves around the storage of input, but now concerns 
the utilization of meaningful information. But where is that information if not 
stored away within the nervous system? How can one utilize information that 
isn’t always there? 

Until recently, no one could have answered this question. But with the 
development of optical holography, we at least have a model of how the nervous 
system could recreate a perspective without utilizing any traditional concept of 
‘storage.’ Transferring the hologram from its initial optical conception into 
neurological functioning requires acknowledgment that the nervous system does 
not function solely by discrete neural impulses. Instead one must take account, 
as Pribram (1971a) has constantly emphasized, of the role of the junctional 
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microstructure and the continuously graded slow potential activities of neuronal 
aggregates. When acknowledgment is made of this second mode of activity of 
neural functioning, the way is clear for conceptual and empirical advances of the 
order of magnitude of those made when the reflex arc conception was aban- 
doned. We can now understand how the brain could, as a functional neural 
hologram, make available to itself virtually immediately a continuous series of 
perspectives from which to render perceptions intelligible and to serve as the 
basis for actions. The representation of meaning can thus be modeled upon the 
holographic metaphor, and this “motor” for the generation of meaning by 
‘memory’ clears up conceptual confusions that simply debilitated the sensory 
metatheory. 


9 SUMMARY 


Traditional identifications of sensory and motor, from the neural level of 
analysis to the behavioral, have been based upon the “passive” reception of 
incoming information and the muscle involvement in behavior, respectively. As 
should be obvious by now, both these interpretations, couched as they are in 
purely surface structure terms, must be abandoned in favor of a deep structural 
account of (broadly speaking) perceiving and acting. When one penetrates to 
the deep conceptual level underlying these disparate surface phenomena it 
becomes clear that the abstract structure underlying them, although common to 
both, cannot be identified with either. Thus the question of how to interpret 
‘the mind’ (construed as that abstract, underlying structure) arises. The sensory 
metatheory of mind arose as an initially plausible attempt to make ‘the mental’ 
intrinsically sensory, by the simple expedient of stretching the surface meaning 
of sensory and abandoning action (to the status of a consequence of mental 
processing). This chapter has argued instead for a motor metatheory of 
mind which, abandoning both traditional interpretations of sensory and mo- 
tor, attempts to characterize the communality of acting, perceiving, and 
comprehending. 

Any viable psychological theory must encompass the fundamental fact of 
productivity or creativity in all perception, cognition, language, and action. It is 
the productive nature of all organismic activity which simultaneously rules out 
any as yet proposed alternative to the motor metatheory and which also 
provides its strongest support. The revolution in cognitive psychology, to the 
extent that it has as yet occurred, consists mainly in recognizing the pervasive 
presence of problems of productivity. This chapter attempts to aid that revolu- 
tion by making the problems explicit in domains that have not as yet received 
sufficient examination, and by proposing the extension to all domains of ‚the 
class of theory most capable of encompassing productivity. The result is a 
conceptual framework which, for the first time, allows both the incorporation of 
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the basic insights of the transformational revolution in linguistics into psychol- 
ogy and also unites the information-processing capabilities of the psychological 
subject with his potentialities for action. 


CHAPTER NOTES 


Note 1. It must be emphasized that this chapter argues against a sensory 
conception of mind and in favor of a motor conception, but that it makes this 
metatheoretical argument by discussing theories that fit these conceptual frame- 
works, Thus “the sensory theory” and “the motor theory” are (possibly 
misleading) short-hand formulations employed to save space and function as 
labels, There is indeed no general motor theory of mind, nor any general sensory 
theory; there are only substantive theoretical formulations which stem from one 
or the other metatheoretical framework. 

Note 2. Such an approach lands Neisser with strange bedfellows indeed. One 
can hardly imagine a more characteristic constructive slogan than “seeing does 
not imply something seen,” yet this is a battle cry not only of the traditional 
cognitive theorist but also of the most extreme radical behaviorist: 


A man need not copy the stimulating environment in order to perceive it, and he need 
not make a copy in order to perceive it in the future. When an organism exposed to a set 
of contingencies of reinforcement is modified by them and as a result behaves in a 
different way in the future, we do not need to say that it stores the contingencies. 
What is “stored” is a modified organism, not a record of the modifying variables 
[Skinner, 1969, p. 274]. 


Skinner (1974) has even contradicted his earlier empty organism, passive concep- 
tion of man by admitting the controlling influence of covert events (e.g., p. 223) 
in a formulation that comes very close to being “mediational.” The point to 
remember, however, is that Skinner cannot dismiss all of cognitive psychology 
on this point, but only the “computer modelers” and “information processors” 
(such as Shiffrin in the subsequent quotation in the text). And unfortu- 
nately for Skinner, he is also quite inconsistent: when it comes to his “ex- 
planation” (which is to say, paraphrase) of rule-governed behavior, Skinner relies 
very explicitly on the storage of symbolic contingencies (see Mahoney, 1974, 
especially Chapter 9). 

All of this serves to emphasize a crucial point: seemingly disparate theoretical 
formulations, especially when the theorists involved want to separate themselves 
from the opposition, may nonetheless be identical or very similar on crucial 
points because they are variants of the same metatheory. Since Skinner and 
Neisser are both “sensory” theorists, it is not at all surprising that essentail 
aspects of their positions overlap. 

Note 3. The cerebellum has grown commensurately with the neocortex 
throughout vertebrate evolution, and the immense superiority of man’s skills is 
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due to this coordination of development. Man’s superiority is a matter of his 
skills, both motor skills in the traditional sense and such “higher” skills as 
language, thought, and culture. By performing an enormous number of computa- 
tions, the cerebellum acts as a comparative computer to control skilled move- 
ments. There is immediate input to the cerebellum from the motor cortex, and 
the cerebrum cannot initiate any action without informing the cerebellum. J. C. 
Eccles has exhaustively investigated the cerebellum in the last decade, and this is 
his summary statement of its functioning: 


What you do with ordinary movements is to give a general command—such as “place 
finger on nose,” or “write signature,” or “pick up glass”—and the whole motor 
performance goes automatically. For example, you don’t have to spell out your name 
letter by letter when you're writing your signature—if you did, the bank manager would 
not recognize it! You just give the general command from the cerebrum and let the 
cerebellum take over in order to give the fine characteristic details. 

Normally our most complex muscle movements are carried out subconsciously and 
with consummate skill. The more subconscious you are in a golf stroke, the better it is, 
and the same with tennis, skiing, skating, or any other skill. In all these performances we 
do not have any appreciation of the complexity of muscle contractions and joint 
movements. All that we are voluntarily conscious of is a general directive given by what 
we may call our voluntary system. All the finesse and skill seems naturally and 
automatically to flow from that. It is my thesis that the cerebellum is concerned in all 
this enormously complex organization and control of movement, and that throughout 
life, particularly in the earlier years, we are engaged in an incessant teaching program for 
the cerebellum. As a consequence, it can carry out all of these remarkable tasks that we 
set it to do in the whole repertoire of our skilled movements in games, in techniques, in 
musical performance, in speech, dance, song, and so on [Eccles, 1973, pp. 122-123]. 


Man can perform skilled behaviors such as threading a needle or building a 
miniaturized transistor circuit that are far beyond the control limits of move- 
ment of even the highest primates, and his ability to perform such delicate 
operations is due to the interaction of the cerebellum with the cerebrum. The 
exhibition of intelligence in skilled performance could not take place without 
the coalitional control of the cerebellum (see Note 13). 

Note 4. Historical alternatives to sensory theories of mind extend at least as far 
back as Plato, for the Timaeus develops a prototypic motor theory of mind. Asa 
sidelight to developing Plato’s cosmology and ethics, the Timaeus develops a 
conception of the World Soul as an artificer or craftsman who shapes the 
universe according to divine plans or templates. This Divine Craftsman not only 
shapes and animates the cosmos, he is also the model for living things such as 
man. The essence of man is thus craftsmanship, and our creativity and produc- 
tivity is due to this generative capacity that we share with the divine artificer. 
Regardless of how fanciful the account seems, it is clear that Plato had an active 
epistemology and a motor theory of mind that made craftsmanship and creativ- 
ity the prime metaphors for understanding the uniqueness of man (his soul). 

Plato also proposed more specific motoric theories of the higher mental 


processes. Not long ago the writer listened to an introductory lecture on 
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perception which pointed out (to the instructor’s wry amusement and the 
students’ utter bewilderment) that Plato had been so “unscientific” as to 
propose that vision involved the eye “spitting out” particles which struck the 
objects seen. This sensory theorist then pointed out how the first True Scientist 
(Aristotle) had proposed today’s common-sense view that vision involves the eye 
passively registering information. Although ridiculed by sensory theorists then 
and now, Plato’s notion embodies a motor theory of vision quite consonant with 
current views (such as those outlined in Section 4 below). 

Plato’s alternatives to Aristotelian conceptions were virtually ignored through- 
out history, although the active epistemologists such as Leibniz and Kant kept alive 
some of his views. I have suggested that the Platonic epistemology is far more 
tenable than its Aristotelian alternative (Weimer, 1973), and this entire chapter 
argues as forcefully as possible that Plato’s motor theory, at least in its emphasis, 
is the only viable conception that we have. 

Note 5. Pribram locates signification in pathways related to so-called primary 
sensory systems. Signs are thus constructed when actions operate upon percep- 
tual images: “Discrimination learning, pattern recognition, selective attention, all 
involve neural choice mechanisms, choices that beget actions which in turn 
modify what is Imaged [Pribram, 1971a, p. 312].” Anatomically these functions 
seem to,involve structures posterior to the central sulcus, such as the prestriate 
cortex and portions of the temporal lobe. 

The connections important to symbolic processes seem to involve frontal lobes 
and certain limbic structures. Pribram argues that the limbic system is ana- 
tomically suited to symbolic behaviors because context sensitivity requires that 
“a set of contexts must become internalized (i.e., become brain states) before 
the appropriate response can be made. Building sets of contexts depends on a 
memory mechanism that embodies self-referral, rehearsal, or, technically speak- 
ing, the operation of sets of recursive functions. ... The closed loop connectiv- 
ity of the limbic systems has always been its anatomical hallmark and makes an 
ideal candidate as a mechanism for context dependency [Pribram, 1971b, pp. 
24-15].” 

This corticofugal conception can be compared with the classic cortico-cortico 
relay conception by studying Fig. 3. 

Note 6. The motor theory is not the muscle theory. A very telling joke about 
John B. Watson’s rabid antimentalism states that Watson made up his windpipe 
that he didn’t have a mind. Thinking, for Watson, was merely the activity of the 
peripheral musculature involved in speech. With this peripheralism and “muscu- 
lar” emphasis Watson had a muscle theory of the mind, but not a motor theory 
as we are using the term. Watson’s mind was passivist, empiricist, and associa- 
tionist, and his nervous system was a switchboard with discrete sensory and 
motor components, In contrast a motor theory requires an active epistemology, 
postulation of rich innate structuring, emphasizes the inherently ‘motor’ nature 
of the sensory—perceptual systems, and is incompatible with associationism as 
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Motor cortex 


Visual 


(b) 
Inferotemporal cortex 
and Pribram’s proposed (b) mechanism of cortical 


involvement in perception and cognition. Classic mechanism involved cortico-cortico connec- 
tions from “primary” sensory areas to frontal association areas to motor cortex. Pribram’s 
proposal involves cortico-subcortico connections of “association” areas with basal ganglia 
and other subcortical motor areas. (From Pribram, 1971a, p. 368.) 


FIG.3 Comparison of the classical (a) 


the only principle of mental functioning. But most importantly, the motor 
metatheory is a theory about the central nervous system only, and it makes no 
identification of higher mental functioning with either the peripheral nervous 
system or behavioral responding. To do the latter, as Watson did in identifying 
thinking with the activity of the larynx, is to commit the operationalist’s fallacy 
of confusing the evidence with that which is evidenced. 

Note 7. Thinking is not unique among human skills in its hierarchical organiza- 
tion and productivity. Students of motor skills have long recognized the com- 
monality between language and skills, and the inadequacy of stimulus-response 
models to account for the complexity of skilled behavior. For example, Fitts 
(1964) defined skilled responses as those in which receptor-effector-feedback 
processes are both spatially and temporally highly organized, and took the goal 
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of skills-learning study to be the explanation of how such organization or 
patterning comes about. He argued this way well before the revolution in 
psycholinguistics had made itself felt to the remainder of psychology: “Rather 
than viewing perceptual-motor behavior as a series of motor responses made to 
reach some goal, it is possible, and I believe considerably more profitable, to 
view such behavior as an information processing activity guided by some general 
plan or program [Fitts, 1964, p. 248].” Later, speaking of learning as an 
adaptive process, Fitts noted that “Basic to the adaptive system is the existence 
of hierarchical processes [p. 250].” He noted that as long ago as 1899 R. S. 
Woodworth had concluded polyphasic (two-phase) motor units (which he took 
to be basic components of skills) were patterned in advance of their initiation, 
and remain uninfluenced by sensory inputs during their execution. Woodworth 
also argued that rapid polyphasic reponses, such as tapping successive targets, 
involve the preformation and emission as a unit of more than one polyphasic 
unit. 

Since this high degree of sophistication and awareness of the issues has 
characterized the area of skills learning for at least three quarters of a century, 
one may wonder why this area did not lead the revolt against behaviorism. Two 
factors are especially relevant: First, the area has always been viewed for its 
practical and applied utility rather than its theoretical import; second, the 
“theories” that were developed have been paraphrastic redescriptions of the data 
to be explained. Woodworth’s descriptive conclusions, for example, are as 
correct today as 75 years ago, but descriptions are not explanations. It remained 
for other theorists (e.g., Bartlett, Chomsky) to look to language for theoretical 
concepts (deep structure, schemata, etc.) which, with present-day hindsight, the 
motor skills data were just waiting to suggest. 

But regardless, the point remains: thinking is representative of all skills in its 
coalitional organization and patterning, and its productivity. And in this regard, 
skill learning is part of the domain of cognitive psychology. 

Note 8. Piaget remains a biologist looking at psychological development: 
development and evolution within the individual are intrinsic to biological 
adaptation. Adaptation is always an organism-environment interaction, and 
adaptation so construed is the biological basis of knowledge. Piaget is therefore a 
biological constructive theorist. As Furth (1969) puts it, “knowledge is neither 
solely in the subject, nor in a supposedly independent object, but is constructed 
by the subject as an indissociable subject—object relation [p. 19].” Since the 
term “knowledge” in traditional usage implies static or fixed objects that are 
stored away by the organism, Piaget rarely speaks of knowledge, but rather of 
the processes of adaptation and the exhibition of intelligence: 

Piaget . . . employs the assimilation-accommodation paradigm throughout all the mani- 


festations of knowing activities. With increasing adaptation in mental development the 
undifferentiated matrix of organism-in-environment becomes structured into organism- 
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knows-environment. The structure that is thus built up is intelligence, considered as the 
general framework within which we know the world [Furth, 1969, p. 181]. 


This means that sensation or stimulation alone cannot constitute knowledge. 
Further, unless there is a structure within the organism (a scheme or schemata) 
prepared to assimilate or accommodate to stimulation, it will not even lead to 
knowledge. Compare this to Shiffrin’s position quoted in Section 1.1. 

Note 9. The clash between Gibsonian and constructivist accounts of percep- 
tion is well brought out in several contributions and discussion sections in 
Weimer and Palermo (1974). By concentrating on the question of what is 
perceived, and emphasizing the modulation of information present in the stim- 
ulus array, the Gibsonian need not speak of constructive processes when dealing 
with the elementary aspects of perception of the properties of the visual array. 
The information specifying surfaces, texture gradients, presumably even depth, 
is given directly in the stimulus array. In contrast, information specifying the 
meaning of speech sounds is not directly present in the acoustic signal, and 
requires processing of an apparently constructive sort. Thus Turvey (1974) 
postulates two modes of perception, one of which is constructive and restricted 
to linguistic materials, the other tacit and nonconstructive. 

But against this view it can be maintained that all perception is constructed 
because what we experience in perception, our knowledge by acquaintance, is 
not a reflection of the objects that cause our experience, but rather of our 
processing structure (see Section 5). One can argue with Gibson that the 
information is directly there in the stimulus array, but it is clear that our 
experience is not. All perceptual experience must be a construction, even of the 
elementary properties such as surface, texture, and depth. Insofar as perception 
involves meaning, there must be construction and inference. It will not do to say 
that we perceive the way a radio receiver merely modulates, or transduces, the 
information present in radio waves. A radio doesn’t perceive the information it 
transduces. We, as percipient beings, do more than modulate that information, 
we perceive it and (try to) understand it. If the radio could perceive, it would do 
more than modulate. That “more” would involve constructing a perceptual 
experience. 

Note 10. The contention that all the attributes of sensory and other mental 
qualities are relations to other qualities, and that the totality of such relations 
exhausts what we can know by description of the mind, does not deny the 
occurrence of knowledge by acquaintance. A congenitally deaf scientist who 
studies music will never know its experienced qualities, to be sure; but “What we 
are denying is not that sensory qualities may possess attributes which those who 
cannot hear cannot learn about, but that whatever incommunicable attributes 
sensory qualities may possess can even raise a scientific problem (Hayek, 1952, 
p. 34].” Likewise, phenomenal experience is always richer than our knowledge 
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by description can capture: “The congenitally blind or deaf can never learn all 
that which the seeing or hearing person owes to the direct experience of the 
sensory qualities in question, because no description can exhaust all the distinc- 
tions which are experienced [Hayek, 1952. p. 35] .” 

Note 11. It is instructive in this regard to compare Hanson’s treatment of 
patterns in the epistemic context of understanding the nature of theories with 
Bransford’s treatment of patterns in comprehension. In both instances, what is 
emphasized is that seeing entities as patterned, as conceptually specified gestalts, 
occasions understanding. In the philosophical context that understanding is 
“theoretical explanation,” while Bransford refers to it somewhat more generi- 
cally as comprehension. Hanson (1970) states the following: 


That patterns affect the significance of lines, dots, shapes, and patches—which might 
have been in perceptual turbulance otherwise—is our fundamental datum. It has pro- 
found epistemological consequences. Knowledge is a function of how our experiences 
cohere. Observations made before the perceptual pattern is appreciated are epistemically 
distinct from the observations (and their descriptions) made after that pattern has cast 
them into intelligible constellations. ... Just as perceptual pattern recognition at once 
gives significance to elements perceived and yet differs from any perception of dots, 
shapes, and lines, so also conceptual pattern recognition at once gives significance to the 
observational elements within a theory and yet differs from any awareness of those 
elements vis-à-vis their primary relationship to events and objects [pp. 236-237]. 


As Bransford and McCarrell (1974) put it: “We have argued that one’s knowl- 
edge of his environment is considerably richer than knowledge of the perceptual 
characteristics of isolated objects, and that perceptually derived knowledge 
entails knowledge of relations rather than things [p. 200].” 

Both positions are extensions to different levels of analysis of the fundamental 
point that Hayek made concerning the organization of the nervous system. 

Note 12. The problem plaguing a theory that attributes causal efficacy to 
consciousness is to explain how part of the system of neural impulses could 
control the occurrence of other neural impulses. Since there is no difference 
among the impulses per se, it is hard to see how organizational direction could 
come from some and not others. Neither active firing nor active inhibition seems 
adequate, for these processes are too tied to anatomic localization to deal with a 
functionally specified system. How then can we explain Sperry’s “prevailing 
overall dynamics of the whole active process” as the mechanism of causal 
control? 

One possible answer is provided by the concept of mutual entrainment (see 
Dewan, 1969, 1976). As an empirical phenomenon, mutual entrainment refers 
to the emergence of regulation in a system composed of a large number of 
subsystems, due to the mutual interactive effects of those subsystems. For 
example, AC generators in the United States (except Texas) are entrained. They 
all lock in step and stabilize their output by virtue of being connected to a 
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common transmission system. This self-organization in which a system of oscilla- 
tors functions as a single unit is easily explained: 


If one generator leads the others in phase, i.e., if it is slightly faster, then its energy will be 
absorbed, not only by the load, but also by all generators which lag behind it. This will 
increase the load on the generator, forcing it to slow down a bit so that it won’t “get out 
of step.” If by chance it lags in phase, the other generators pump energy into it so that it 
catches up. Thus, generators which go a bit too fast are slowed down while those that 
lag are speeded up. They pull together in frequency ... The stability and accuracy of a 
system of generators is far greater than any single unit. This mutual entrainment is a 
splendid example of self-organization, . . . an emergent property of the entire system 
which goes far beyond what any single unit can accomplish in accuracy and power 
[Dewan, 1976, p. 185]. 


Consciousness could, via mutual entrainment of neural impulses, thus exert a 
controlling influence. What we require now is a theory of generalized control of 
mutually entrained systems to understand how consciousness could act, and to 
derive testable predictions about changes in output with changes in the relevant 
entrainment relations. This will lead to theories of adaptive control systems 
providing additional, self-regulating control for more basic feedback systems. 
Consciousness, as a coping system, would be such an adaptive or perhaps a 
superadaptive control system (Dewan, 1976). It would also, as argued in Note 
13, be a coalitional control system. 

Note 13. Sperry’s contention that each hemisphere is autonomously conscious 
is prima facie support for a position developed by Robert Shaw to deal with the 
enormous complexity of the organized system that is the totality of human 
higher mental processes. Shaw argues that the concept of hierarchical control 
systems, although an enormous improvement over linear or chaining control 
systems, is still incapable of dealing with the higher mental processes as a whole. 
When attention is confined to any single process, such as language, perception, 
or skills, a hierarchical control model may be adequate to model the system. But 
a hierarchical model cannot account for the autonomy of functioning that all 
the higher mental processes exhibit within a single person. When dealing with the 
person as a whole, there is no single locus of control that is always in command 
of the entire system, as is required by the logic of hierarchies. Instead, the 
person as a whole seems to be a coalition of hierarchical structures, somehow 
allied together but with no single locus of control. There is cooperation, but not 
determinate control, among the various mental systems. There is also a lack of 
determinate specification of boundaries between systems. Clearly perception is 
not skilled locomotion, yet one cannot separate the systems in any determinate 
manner—the boundaries are intrinsically “fuzzy.” Further, persons are super- 
additively complex entities: they are more than just the sum of the systems 
making them up, which is another property of coalitional structures (see Shaw & 
McIntyre, 1974; Von Foerster, 1962). 
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One can see the coalitional nature of the mind by considering the effects of 
removing or suppressing a higher mode in hierarchical control systems. Were this 
to occur in a hierarchy, functioning would be disrupted, and the overall organi- 
zation of the system would deteriorate. But precisely this does not occur in a 
coalitional structure: functioning may change, but it is not disrupted or disinte- 
grated. Sperry’s account of the relationship between the major and minor 
hemispheres indicates precisely the decentralization of control, fuzziness of 
boundaries, resistance to disruption, and so on, that is characteristic of coali- 
tions. But the concept of coalitional structure allows us to understand both the 
unity of the person (as a singular subject of conceptual activity, as the “self,” 
etc.) and also the welter of diverse systems that “go together” to make him up. 
The person ‘goes together’ exactly as a coalitional structure “goes together.” 
Such an account reconciles ‘unity in diversity’ far better than a hierarchical 
model can allow, and makes explicit the persistent historical theme (familiar in 
the esoteric traditions such as Yoga) that the higher mental processes are ‘tools’ 
of the ‘I’ or self. Indeed, if the mind were a hierarchy, it would be nonsense to 
say that the higher mental processes were its tools—for how could something be 
a tool of itself? But if the mental processes are understood as hierarchical 
systems within a superadditive coalitional structure, then not only Sperry’s 
observations, but clinical observations of ‘‘divided selves,” and the esoteric 
conception of the mind as a tool of the ‘I,’ make sense. 
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Speech perception is not ordinarily included in the body of phenomena and 
theory that convention defines as the psychology of perception. Yet the prob- 
lem of how the perceptual categories of speech are specified in the acoustic 
signal is a primary example of the problem of perceptual constancy. In spite of 
its neglect by psychology, speech and its signal have been intensively studied by 
members of other disciplines. We think that some of the results and puzzles 
generated by this research are relevant to the concerns of our colleagues whose 
primary interests are in other facets of human cognition. 

Speech perception at any level involves classification. The classificatory step is 
assumed whenever we move beyond a purely physical (acoustic) description of 
speech to a psychophysical description in terms of perceptual units. Unlike 
certain problems in traditional psychophysics in which the choice of units may 
be arbitrary, there is wide consensus about what the units of perception are in 
the case of speech. This consensus is the product of centuries of linguistic 
investigation, during which many attempts have been made to isolate the various 
levels and units that constitute our perception of speech. Viewed in terms of 
structure, speech is a hierarchical system that manifests what Hockett (1958) 
called a “duality of patterning”: it employs both meaningful and meaningless 


1 Now at the University of Connecticut and Haskins Laboratories. 
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units. Morphemes (or, roughly speaking, words) are the smallest of the meaning- 
ful units. In all languages, morphemes have an internal structure composed of 
smaller meaningless segments, the phonemes (Bloomfield, 1933). Since the 
communication of meanings ultimately rests on a foundation of phonemic 
structure, a basic part of the task of understanding how speech is perceived is to 
discover the conditions for the perception of phonemic categories.” For present 
purposes, we will ignore meaning and concern ourselves only with the phonemic 
message—that is, with the perception of syllables and their phoneme segments, 
familiar to us as the consonants and vowels. 

In speech, as in handwriting, no two “‘signatures” are alike. In generating the 
“same” phoneme, different speakers do not produce sounds that are acoustically 
the same. Indeed, the same signal is never exactly repeated by the same speaker. 
In perceiving speech, as in identifying objects, we ordinarily regard only those 
distinctions that are critical, ignoring those that are merely incidental. While no 
one would deny that speech signals are intrinsically variable for many reasons, 
the implications of this variability for perception have not been widely appre- 
ciated. In brief, they constitute a major problem in perceptual constancy. 


THE PERCEIVING MACHINE AND THE ONE-TO-ONE PROBLEM 


Although our concern is with how the human perceptual system works, it may 
help us to bring this problem into focus if we consider how a machine might 
proceed in recovering a string of phonemic segments. Consider, for example, the 
problems to be solved in designing a voice-operated typewriter. The goal of such 
an automatic speech recognition device is to type out the appropriate string of 
phonemic symbols (or perhaps standard orthographic symbols) in response to 
any speech input. In the simplest case, the only information available to the 
device will be the acoustic waveform itself. A human listener, of course, can 
usually take advantage of other sources of information, including both the 
linguistic and the situational context of the utterance.* While acknowledging the 


*The phoneme is the minimal unit by which perceivers differentiate utterances. For 
example, the word bad has three phonemic segments, /b/, /æ/, and /d/, that differentiate it 
from such words as dad, bed, and bat. In different utterances, a phoneme may be realized 
acoustically in different ways; linguists call these variants “phones.” For example, the final 
/t/ in bat might be either released (acoustically, a pause followed by a burst) or unreleased 
(no burst). The class of phones is potentially infinite, and it is arguable whether phones 
(however defined) are natural perceptual units. Our emphasis in this paper is on how the 
identity of a phoneme is perceived despite variations in its acoustic form. 

`The importance of context to human listeners has been elegantly demonstrated by 
Miller, Heise, and Lichten (1951), who found a remarkably predictable relationship between 
the amount of acoustic distortion that yields a given level of intelligibility and the 
informational redundancy of the message. For a given signal-to-noise ratio, intelligibility was 
greater for words heard in sentence context than for words heard in isolation. 
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importance of context, we should not overlook the fact that listeners can 
identify arbitrarily chosen words and nonsense syllables with high accuracy 
when listening conditions are favorable. In other words, we are not posing an 
unrealistic problem for our hypothetical device; human listeners can do remark- 
ably well when little contextual information is available. 

Many attempts have been made in recent decades to design a voice-operated 
typewriter, but the problem has thus far proved elusive. Despite a degree of 
success with severely restricted vocabularies when words are spoken by one of a 
few trained talkers, a generally useful speech recognizer continues to be unat- 
tainable. As Hyde (1972) notes, “there are still no devices which can perform 
even moderately well on normal (conversational) speech in normal (noisy) 
environments by a normal range of talkers [p. 399] .” 

It is worth considering for a moment how the operation of a speech recogni- 
tion system has typically been conceived. As in many other automatic pattern 
recognition devices, the procedure involves two stages. In the first stage, the 
basic units or segments are located, For example, in automatic reading of print, 
this would correspond to isolating individual letters. In the second stage, each 
segment is identified as an instance of one of a fixed set of objects. In the case of 
print reading, this would correspond to identifying a segment as a particular 
letter of the alphabet. Thus a successful voice-operated typewriter would have to 
be able to perform two operations on the acoustic waveform of any speech 
signal. First, it would have to divide the waveform into acoustic segments that 
have a one-to-one correspondence through time with the sequence of phonemes 
in the utterance. Second, it would have to detect the presence or absence of 
acoustic features that are critical for identifying particular phonemes. This 
second stage is often conceived of as requiring a set of filters, each filter being 
tuned to a critical acoustic property (defined along dimensions such as fre- 
quency, intensity, and duration). 

This strategy implies certain widespread assumptions that only in recent years 
have been successfully challenged. For example, we find that in the standard 
accounts of speech acquisition, it is tacitly assumed that speech consists of a 
collection of elementary sounds “transparent” to the infant, such that he or she 
automatically recognizes a parent’s utterance of /d/ as “the same sound” as his 
or her own utterance of /d/ (Watson, 1924; Allport, 1924). Similarly, taxonomic 
linguists working in the tradition of Bloomfield (1933) supposed that all lan- 
guages are sampled from a common inventory of sounds (phones). Working from 
phonetic transcription of a large number of utterances as a base, these linguists 
developed highly successful procedures for determining which sound contrasts 


played a role in any particular language. It was believed that the great practical 


success of transcription as a tool for language description rested ona narrow 
definition could be given for each 


physical base, and that, in principle, an acoustic ; à 
phone. In this view, speech was conceived as a kind of sound alphabet, in which 
each phone is conveyed by a discrete package of sound with a characteristic 


318 D.SHANKWEILER, W. STRANGE, AND R. VERBRUGGE 


spectral composition. The pervasiveness of this assumption has been noted by 
Denes (1963): 


The basic premise of (most speech-recognition) work has always been that a one-to-one 
relationship existed between the acoustic event and the phoneme. Although it was 
recognized that the sound waves associated with the same phoneme would change 
according to circumstances, there was a deep-seated belief that if only the right way of 
examining the acoustic signal was found, then the much sought-after one-to-one rela- 
tionship would come to light [p. 892]. 


In fact, the perceptual skills that underlie phonetic transcription have never 
been explained well enough that an algorithm could be written to permit a 
machine to do the job. From our present perspective it is clear why no one has 
been able to develop a voice-operated typewriter based on the strategy outlined 
above. First, there are no clearly bounded segments in the acoustic waveform of 
roughly phonemic size; that is, there are no acoustic units available for setting up 
a correspondence with phonemes. Second, even if boundaries are arbitrarily 
imposed on the continuous signal, the segments corresponding to a particular 
phoneme often vary considerably in their acoustic composition. Moreover, any 
one of those acoustic segments, transferred to a different phonemic environ- 
ment, might be heard as a different phoneme altogether. Not only do the 
physical attributes specifying a particular phoneme vary markedly, but the same 
physical attribute can specify different phonemes depending on the context. 


THE CONTINUOUS SIGNAL DOES NOT REVEAL THE SEGMENTATION 
OF THE PHONEMIC MESSAGE 


The conclusions stated above are the results of three lines of investigation begun 
in the mid 1940s and continuing to the present. We turn now to review briefly 
the nature of the evidence. 

Of special importance from our standpoint are a series of tape-cutting and 
tape-splicing experiments that had the effect of shaking the general confidence 
that phonemes are conveyed by isolable bits of sound. These experiments failed 
to find any way to divide the signal on the time axis to yield segments of 
phoneme size. For example, given a consonant-vowel syllable such as go, there is 
no way to cut the piece of magnetic tape so as to produce the consonant /g/ 
alone. Some vowel quality always remains. Moreover, if a consonant-vowel 
syllable is cut at some point, the consonantal portion may not be heard as the 
same phoneme when spliced to a recording of a different vowel. Schatz (1954), 
for example, found that the consonantal portion of an utterance of /pi/ was heard 
as /k/ when it was joined to the vowel /a/. Harris (1953) and Peterson, 
Wang, and Sivertsen (1958) independently concluded that assembled speech 
made by splicing together prerecorded segments is not generally intelligible when 
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the units are smaller than roughly a half syllable.* Some investigators (e.g., Cole 
& Scott, 1974a, b) continue to argue that speech perception may be based in 
large part on the detection of acoustic invariants for phonemes. They have 
claimed that a one-to-one correspondence can often be found, that spliced 
segments can preserve their identity when transferred to new phonemic con- 
texts. However, much of this apparent “invariance” disappears when one is 
careful to cut the initial segment sufficiently short so that no trace of the 
subsequent vowel remains (cf. Kuhl, 1974). 

A second important development is the study of spectrographic displays of 
speech. This work was made possible by the invention of the sound spectrograph 
(Koenig, Dunn, & Lacey, 1946) during World War II and by the general 
availability of such devices for research during the postwar years. The spectro- 
graph displays, in graphical form, the time variations of the spectrum of the 
speech wave. This representation of the sound patterns of speech is valuable for 
the information it gives about articulation. The energy in speech sounds is 
concentrated in a small number of frequency regions that appear on a spectro- 
gram as horizontal bands (called “formants”). The location of the formants on 
the frequency scale reflects the primary resonances of a talker’s vocal tract 
(Fant, 1960). Since the shape of the vocal cavity changes at the joining of 
successive consonants and vowels, the formant frequencies may be seen to 
modulate up and down as one scans a spectrogram along the time axis. However, 
efforts to locate discrete information-bearing units along the time axis have met 
with repeated failure. The phonemic and syllabic segments, which are so clear 
perceptually, have no obvious correlates in a spectrogram, as evidenced by the 
fact that spectrograms are very difficult to read, even after much experience 
(Fant, 1962; but see also Kuhn & McGuire, 1975).5 

Figure 1 shows spectrograms of two syllables, bib and bub, Note that the 
formant frequencies are nonoverlapping for the entire duration of the syllables, 
not just in the middle portion. Although the syllables differ phonemically Ge., 
perceptually) in one segment (the medial vowel), acoustically they differ 
throughout. 

Failure to find obvious acoustic cues in spectrograms led to a third line of 
investigation: a variety of experiments with synthetic speech, produced by 
devices that place acoustic parameters under the experimenter’s direct control. 
Early work by researchers at Haskins Laboratories made use of hand-painted 


It does not follow from this result that the minimal perceptual unit is larger than the 
phoneme. The inference we would draw is that decisions about phoneme identity are made 
with regard to information distributed over the whole syllable (and sometimes, perhaps, 
over a number of syllables). 

5In remarking on the difficulty of rea 
spectrogram does not represent the relevant pho! 
has readier access to the brain’s phonemic decoder than the eye. 


ding spectrograms our point is not that the 
nemic information, but rather that the ear 
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“u ” 


bub 


kilohertz 


Air 


am 


Isec lsec 


FIG. 1. Spectrograms of tokens of two syllables that differ in the vowel. Note that the 
formant pattern of “bib” does not coincide with that of “bub” in any portion. 


patterns resembling spectrograms that were converted into sound by a photo- 
electric device, the Pattern Playback (Cooper, Liberman, & Borst, 1951). The 
Pattern Playback and subsequent computer-controlled electronic synthesizers 
have made it possible to do analytic studies in which one parameter is varied at a 
time, to determine which parameters were critical for particular phonemes. Only 
through systematic psychophysical experimentation of this sort has it been 
possible to locate the linguistically relevant information in the speech spectrum 
(Cooper, Delattre, Liberman, Borst, & Gerstman, 1952; Liberman, 1957; Liber- 
man, Cooper, Shankweiler, & Studdert-Kennedy, 1967; Liberman, Harris, Hoff- 
man, & Griffith, 1957). A major conclusion of this research is that, in general, 
there is no simple one-to-one correspondence between perceptual units and the 
acoustic structure of the signal. To be successful, synthetic speech must encode 
the information for phonemes into acoustic patterns at least a half syllable or 
full syllable in length. 

These findings make it possible to understand why the design of a voice- 
operated typewriter proved so difficult. Phonemes are not merely joined acousti- 
cally; they overlap so that two or more are represented simultaneously on the 
same stretch of sound. Conversely, segmentation is impossible because informa- 
tion for one phoneme is usually spread over wide stretches of the signal. Even if 
segmentation were attempted, the cues isolated would be radically different in 
another phonemic environment. As shown in Fig. 1, all four segments corre- 
sponding to /b/ would differ markedly in slope and formant frequency range. 

In sum, the radically context-dependent structure of speech dooms to failure 
the kind of pattern recognition procedure outlined above. A procedure that 
combines a prior stage of segmentation with an analysis of segments by a set of 
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tuned-filter “detectors,” operating independently and in parallel, will be insuffi- 
cient as a job description of an automatic recognition machine (and insufficient 
as a model of human speech perception as well). One contemporary approach to 
the recognition problem (Mermelstein, 1974) is explicit on this point, acknowl- 
edging that a speech recognizer would have to extract information about 
component phonemes over longer stretches of the signal than a syllable and 
would have to incorporate rules about how that information is distributed. The 
system described by Mermelstein does not assume that the segmentation and 
labeling problems are independent. 


SYLLABLE NUCLEI AS TARGETS 


A commonly suggested strategy for speech recognition (Fant, 1970) is to classify 
first those phonemes that are most “transparent” in the signal, and then to use 
that information as a basis for determining what the more contextually variable 
phonemes are. For example, the research described earlier found that the 
acoustic form of many consonants is heavily dependent on the coarticulated 
vowel. If the vowel were easily detected in speech signals, then it could be 
identified first and used as a basis for disambiguating the neighboring consonant. 

There are several reasons for supposing that vowels could be extracted readily 
by a routine based on a filter bank, though it would not be possible, in general, 
for consonants. In productions of sustained vowels, the positions of the for- 
mants on the frequency axis serve roughly to distinguish the vowels of a 
particular talker. For a high front vowel such as /i/, the first and second formants 
are spaced far apart (on the frequency axis) while the second and third for- 
mants are spaced close together. For a high back vowel /u/, the pattern is 
just reversed. These patterns can be synthesized with a combination of steady- 
state resonances which simulate the formants found in spectrograms of natural 
vowels. Synthetic stimuli are readily labeled as vowels by listeners, and it is 
possible to generate the full complement of English vowels with two or three 
formants (Delattre, Liberman, Cooper, & Gerstman, 1952). 

This acoustic characterization of vowels as steady-state entities is reinforced by 
articulatory considerations. The flow of speech is marked by a rhythmic pattern 
of syllables; each syllable contains a vowel “nucleus” that is usually coarticu- 
lated with one or more consonants. It is usual to think of consonants as 


the dynamic component of speech, since they are generally produced by 


movement of the articulators, and to regard vowels as the static component, 
al tract configuration and 


since they may be produced with a stationary voc 
sustained indefinitely. 

This contrast is emphasized by the concept of an idealized vowel as a 
prolonged, static entity defined (acoustically) by the frequencies of the first two 


or three formants, that is, by the primary resonances of the stationary vocal 
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tract. At least for individual talkers, then, there should be a distinctive set of 
frequency values associated with each vowel, in contrast to the variable values 
associated with the talker’s consonants. In that case, vowels should be retrievable 
by a simple two-stage recognition procedure: it would be a straightforward 
matter to detect the presence of steady states electronically (thereby isolating 
vowel segments for analysis), and a filter bank could then determine which set of 
critical frequencies the vowel sound best fits. 

Unfortunately for this approach, the apparent simplicity of vowels is largely an 
illusion. Vowels in natural continuous speech, unlike the artificially prolonged 
vowels of the phonetics laboratory, are not generally specified by steady states 
at all. Let us see why this is so. 

First, as a result of coarticulation, vowels are encoded into the structure of a 
full syllable. The imprint of the vowel is not localized but is smeared throughout 
the entire temporal course of the syllable. Thus, information about a vowel is 
available in the transitions as well as in the steady-state portion (if, indeed, a 
steady state is even attained). This was clear in the earlier example of the 
syllables bib and bub; in both cases, the vowel affected the spectral pattern of 
the entire syllable. Moreover, the acoustic properties of the vowel nucleus may 
be affected by coarticulated consonants. Measurements by House and Fairbanks 
(1953) and Stevens and House (1963), for example, indicated consistent changes 
in the duration, fundamental frequency, and formant frequencies and intensities 
of vowels, depending on consonantal context. It should also be noted that 
consonants can affect the structure of neighboring vowels if (by the phonologi- 
cal rules of the dialect) a distinction between two consonants is actually 
manifested by a difference in the neighboring vowels. For example, the /d/ in 
rider is distinguished from the /t/ in writer by the increased duration of the 
vowel that precedes it. Similarly, in some dialects of English, a nasal phoneme, 
such as the /n/ in pants, is realized by a nasalization of the preceding vowel; thus 
the spectral structure of the vowel /2/ will vary markedly depending on whether 
pats or pants is spoken. Because the coarticulation effects between consonants 
and vowels do not operate in one direction alone, but are two-way effects, there 
is no obvious acoustic invariant that characterizes a vowel in all consonantal 
contexts. 

A second major source of variance in the acoustic structure of vowels is the 
tempo of articulation. During rapid rates of speech, steady-state configurations 
may never be attained at all. Acoustic analysis of rapid speech supports the 
hypothesis of articulatory “undershoot,” since syllable nuclei often do not reach 
the steady-state formant frequency values characteristic of vowels in slowly 
articulated syllables (Lindblom, 1963; Stevens & House, 1963). Lindblom and 
Studdert-Kennedy (1967) found that listeners showed a shift in the acoustic 
criteria that they adopted for vowels (i.e., there was a shift in the phoneme 
boundary between them) as a function of perceived rate of utterance. Ap- 
parently, human listeners compensated for this simulated articulatory under- 
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shoot by perceptual overshoot. These data show that formant transitions, which 
are generally understood to carry consonantal information, may also aid in 
specifying the vowel. Thus, in ordinary speech, vowels, like consonants, are 
dynamic entities that are scaled by the pace of speaking. 

A third source of acoustic variation in vowels is associated with the individual 
characteristics of the talker. We perceive this variation directly when we identify 
persons on the basis of voice quality. On the other hand, such individual 
variation is irrelevant and becomes “noise” when our intent is to recover the 
linguistic message. Inasmuch as formant frequencies reflect vocal tract dimen- 
sions, it is obvious that the absolute positions of the formants will not be the 
same for a child as they are for an adult. The extent of the problem is suggested 
by Joos (1948): 

The acoustic discrepancies which an adult has to adjust for when listening to a child 


speaker are nothing short of enormous—they commonly are as much as seven semitones 
or a frequency ratio of 3 to 2, about the distance from /€ / to /U/ [p. 64]. 


Somehow, in spite of this, we manage to understand small children’s speech 
reasonably well and they ours. This is especially remarkable given that the 
difference between children and adults cannot be described by a simple scale 
factor. The vocal tract not only increases in size but changes in shape, and the 
consequent changes in the acoustic output are correspondingly complicated. 
Indeed, Fant (1966) has argued that the assumption of an invariant relation 
between formants one and two for a given vowel is just as untenable as the 
assumption that the absolute formant frequencies of the vowel are invariant for 
all talkers. Thus, the relation between utterances of a syllable by an adult anda 
small child is not the multiplicative relation that obtains between versions of a 
melody played in different keys. 

Having discussed variation based on physical differences in the sound-produc- 
tion apparatus, we should also mention differences that are social in origin, 
reflecting local variations of dialect within the larger language community. These 
variations are associated, of course, with geographical region, ethnic group, and 
socioeconomic class. Additional sources of talker-related variation are idiosyn- 
cratic speech mannerisms, emotional state, and fatigue. These sources, in addi- 
tion to those we have discussed above, pose enormous difficulties to the design 
of an automatic recognition device.‘ 

No one, to our knowledge, has seriously considered how an automatic speech 
recognition routine would adjust its criteria to compensate for the variations 
associated with coarticulation, tempo, and talker. When we consider the magni- 
tude and variety of variations that we take in our stride as perceivers, we begin 
to realize something of the complexity of the relations between the signal and 

é Reflect, too, on the variety of transformations of the signal that might be produced by 
the commonplace feats of talking with food in the mouth, with a cigar between the lips, or 
with teeth firmly clamped on a pencil (cf. Nooteboom & Slis, 1970). 
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the phonemic message. The difficulties encountered by the task of machine 
recognition command a new respect for the subtlety and versatility of the 
human perceptual apparatus and lead us to a new appreciation of the abstract 
nature of speech perception. 


WHAT SPECIFIES A VOWEL? 


The idea that vowels can be defined as fixed sets of steady-state values is an 
oversimplification that bears little relation to the structure of natural speech. We 
have found it necessary to reopen the question of what specifies a vowel, and we 
wish to introduce some recent findings as a case study in the problem of 
perceptual constancy. 

Vowels, as we noted earlier, are traditionally defined by formants. Vowel 
quality is associated with concentrations of acoustic energy in a few relatively 
narrow portions of the frequency spectrum; energy in the regions between these 
bands is generally weak and has little perceptible effect on vowel quality. In 
distinguishing among vowels, the lowest two formants are traditionally thought 
to be the most significant; the contribution to perception of the third and higher 
formants is problematical. For this reason, vowels are customarily represented as 
points located in a two-dimensional space defined by the first and second 
formants. As a result of variations among talkers, the points in this acoustic 
vowel space are actually regions. A critical question for perceptual theory is: 
How much or how little do these regions overlap? 

A thorough assessment of this question was made by Peterson and his col- 
leagues (Peterson, 1951; Peterson & Barney, 1952), who obtained spectrographic 
measurements of tokens of ten American English vowels produced by 76 talkers 
(including men, women, and children), Figure 2 (page 337), which is redrawn 
from Peterson and Barney (1952), shows the vowel space defined by measure- 
ments of formants one and two (F, and F3). We note that there is considerable 
overlap in some regions. In running speech we might expect a comparable 
analysis to show still more overlap, The findings showed not only lack of 
invariance in the position of the formants in children and adults, but also 
considerable average differences between men and women and considerable 
variation among talkers of the same age group and sex. 

In his pioneering monograph on acoustic phonetics, Joos (1948) had discussed 
earlier the dilemma that such variation poses for theories of speech perception. 
If different spectra are heard by listeners as the same vowel, on what does the 
judgment of sameness depend? It cannot, he concludes, be due to any evidence 
in the sound: 


Therefore the identification is based on outside evidence. .. . If this outside evidence 
were merely the memory of what the same phoneme sounded like a little earlier in the 
conversation, the task of interpreting rapid speech would presumably be vastly more 
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difficult than it is. What seems to happen, rather, is this. On first meeting a person, the 
listener hears a few vowel phones and on the basis of this small but apparently sufficient 
evidence he swiftly constructs a fairly complete vowel pattern to serve as background 
(coordinate system) upon which he correctly locates new phones as fast as he hears 
them . . . [Joos, 1948, p. 61]. 


Thus, in Joos’ view, the listener calibrates the talker’s vowel space on the basis of 
a small subset of sample utterances. The listener needs some reference points to 
define the range and distribution of the talker’s vowels. These reference signals, 
Joos suggests, could be supplied by extreme articulations (in terms of tongue 
height and point of tongue contact). Thus, Joos (1948) leaves no doubt that the 
coordinate system he has in mind is based at least in part on a model of the vocal 
tract: 
The process of correctly identifying heard vowel colors doubtless in some way involves 
articulation. A person who is listening to the sounds of a language that he can speak is 
not restricted to merely acoustic evidence for the interpretation of what he hears, but 
can and probably does profit from everything he “knows,” including of course his own 
way of articulating the same phones [Joos, 1948, p. 61]. 


Since the publication of Joos’ work, a normalization step in speech perception 
has been assumed by virtually everyone who has written on the subject, whether 
or not the writer accepted Joos’ version of the motor theory of speech percep- 
tion (or, indeed, any other version of motor theory). The idea that a listener 
makes use of reference vowels for calibration of a talker’s vowel space has also 
persisted. Gerstman (1968) and Lieberman (1973), whose contributions we will 
discuss presently, each have taken up the reference vowel idea and defended it. 
What is surprising, given this overwhelming consensus, is how few attempts have 
been made to measure the ambiguity in perception of vowels that is directly 
attributable to talker variation. 

Joos’ (1948) statement of the constancy problem implies that, for an un- 
known talker, isolated syllables should be highly ambiguous from the standpoint 
of the perceiver. It is perplexing, then, that two experiments that directly 
measured the perceptual ambiguity of natural speech found little support for 
this prediction. Peterson and Barney (1952), to whom we are indebted for 
systematic acoustic measurements of individual differences in vowel formants, 
also attempted to assess the perceptual consequences of the variation they 
discovered in production. The same recorded utterances used in making the 
spectrographic measurements were also assembled into listening tests. Listeners 
had to identify tokens of ten vowels in /h-d/ consonantal environment; the set 
consisted of heed, hid, head, had, hod, hawed, hood, who'd, hud, and heard. 
Syllables produced by groups of 10 talkers (men, women, and children) were 
randomly mixed on each listening test, insuring that opportunities for normaliza- 
tion would be slight. Although plots of the first two formants show that 
the regions occupied by these vowels overlap considerably, perceptual judgments 
were remarkably accurate, with 94% of the words perceived correctly. A 
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similarly low error rate for perception of a larger set of vowels, which included 
diphthongs, was reported by Abramson and Cooper (1959). 

These perceptual data do not support the notion that a single syllable, spoken 
in isolation, is necessarily ambiguous. Apparently, the information contained 
within a single syllable is usually sufficient to allow whatever adjustment for 
individual talker characteristics might be required. However, the history of the 
research following Peterson and Barney’s (1952) study shows that this conclu- 
sion was not generally drawn. 

The work of Ladefoged and Broadbent (1957) and Ladefoged (1967) is widely 
cited as evidence that the listener has relative criteria for vowel identification 
and that the identity of a vowel depends upon the relationship between the 
formant frequencies for that vowel and the formant frequencies of other vowels 
produced by the same talker. As a result, vowel space must presumably be 
rescaled for each voice a listener encounters. The Ladefoged and Broadbent 
(1957) study was designed to find out whether subjects could be influenced in 
their identifications of a test word by variations in an introductory sentence 
preceding it. Synthetic speech was used in order to gain precise control over the 
acoustic parameters. A set of test syllables of the form b-vowel-t was prepared 
on the synthesizer. Listening tests were made up in which the test words were 
presented following a standard sentence: Please say what this word is, Variants 
of this sentence were produced by shifting the frequencies of the first or second 
formants up or down. Each was intelligible despite wide acoustic differences. 
The results showed that the same test word was identified as bit when preceded 
by one version of the test sentence (i.e., one “voice”) and as bet when preceded 
by a second version (another “voice”) in which the first formant varied over a 
lower range. The authors conclude from perceptual shifts such as this that the 
identification of a vowel is determined by its relation to a prior sample of the 
talker’s speech (provided here by the test sentence). 

Ladefoged (1967) interprets these findings within the framework of Helson’s 
(1948) adaptation level theory. This theory attempts to account for the ex- 
traordinary efficiency of the compensatory mechanisms that achieve con- 
stancies, such as color constancy under changing illumination, by supposing that 
the perceiver scales his responses not to the absolute properties of each stimulus, 
but according to the weighted mean of a set of stimuli distributed over time. The 
introductory sentence, in the Ladefoged and Broadbent experiment (1957), is 
understood as providing the standard or anchor, thus creating an internal 
adaptation level to which the test words are referred. We shall return to the 
adaptation level hypothesis presently, after we have introduced some relevant 
findings of our own. 

If it is true that a listener needs a sample of speech in order to fix the 
coordinates of a talker’s vowel system, we need to know how large a sample is 
required and whether particular vowels are more effective than others as “an- 
chors.” As we noted earlier, Joos (1948) believed that the best reference signal 
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would be one that allows the listener to determine the major dimensions of the 
talker’s vocal tract. We therefore suggested that the “point vowels” /i/, /a/, and /u/ 
might be the primary calibrators of vowel space, since they are the vowels 
associated with the extremes of articulation. Lieberman and colleagues (Lieber- 
man, Crelin, & Klatt, 1972; Lieberman, 1973) agree that these vowels probably 
play an important role in disambiguating syllables produced by a novel talker. 
They note that the point vowels are exceptional in several ways: they represent 
extremes in acoustic and articulatory vowel space, they are acoustically stable 
for small changes in articulation (Stevens, 1972), and they are the only vowels in 
which an acoustic pattern can be related to a unique vocal tract area function 
(Lindblom & Sundberg, 1969; Stevens, 1972). 

Gerstman (1968) has made one of the few direct attempts to test the idea that 
a subset of vowels can serve to calibrate a talker’s vowel space and reduce errors 
in recognition of subsequently occurring vowels. Gerstman developed a com- 
puter algorithm that correctly classified an average of 97% of the vowels in the 
syllables produced by the Peterson and Barney panel of 76 talkers. For each 
talker’s set of ten utterances, the program rescaled the first and second formant 
frequency values of each medial vowel, taking the extreme values in the set as 
the endpoints. Since these extreme formant values are typically associated with 
jil, lal, and /u/, the procedure corresponds to a normalization of each talker’s 
vowel space with reference to his own utterances of the point vowels. The 
classification system was essentially a filter bank that classified the vowels 
according to the scaled values for the first two formants, and the sums and 
differences of these values. By inserting the normalization stage between seg- 
mentation and classification, Gerstman’s program succeeded in reducing by half 
the errors in classification made by human perceivers (recall that Peterson & 
Barney’s, 1952, listeners made 6% errors). We must keep in mind, however, that 
a successful algorithm is not a perceptual strategy, but only a possible strategy. 
Although it is of interest that such an algorithm can, in principle, serve as the 
basis for categorization of the signals, we are aware of no evidence that the 
human perceptual apparatus functions analogously. For example, it would be 
necessary to demonstrate that humans scale individual formants and calculate 


sums and differences between them. í 
far outstripped the data bearing on the 


It seemed to us that speculation had 0 
vowel constancy problem. In fact, the few studies of perceivers’ recognition of 
natural (as distinguished from synthetic) speech indicated that isolated syllables 
spoken by novel talkers are remarkably intelligible. Therefore, it seemed impor- 
tant to verify Ladefoged and Broadbent’s (1957) demonstration with natural 
speech in which all the potential sources of information that are ordinarily 
available to perceivers are present in the signal. Similarly, Gerstman s (1968) 
success in machine recognition using the three point vowels as calibrating signals 
needed to be evaluated against the performance of human listeners. Accordingly, 


we designed experiments to determine the size of the perceptual problem posed 
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for the listener when the speaker is unknown. This involved a comparison of 
perceptual errors, under matched conditions, when the test words were spoken 
by many talkers and when all were produced by only one talker. We also sought 
to discover whether certain vowels (e.g., the point vowels) have a special role in 
specifying the coordinates of a given talker’s vowel space. 


HOW DOES A LISTENER MAP A TALKER’S VOWEL SPACE? 
AN EXPERIMENT TO DETERMINE THE SIZE OF THE PROBLEM 


We (Verbrugge, Strange, & Shankweiler, 1974) first attempted to measure the 
degree of ambiguity in vowel perception attributable to lack of congruence of 
the vowel spaces of different talkers. To measure this, we presented listeners 
with unrelated words or nonsense syllables, so that broad linguistic context 
would make no contribution to the act of identification. Our studies of this 
problem were fashioned after Peterson and Barney (1952). We presented nine 
vowels in the consonantal environment /p-p/; thus, the set consisted of /pip/, 
/pıp/, [pE p/, [pæp/, /pap/,/pop/,/pup/, /pap/.and /pup/. In one listening test, the 
listeners heard tokens spoken by 15 different voices (five men, five women, five 
children), arranged in random order. To determine what proportion of per- 
ceptual error is due to uncertainty about the talker, as opposed to other sources, 
a second listening test was employed in which a single talker uttered all the 
tokens on a given test. The talkers were given no special training. They were 
urged to recite the syllables briskly in order to bring about some undershoot of 
steady-state targets. Our objective was to achieve conditions as similar as possible 
to normal conversational speech. 

Listeners misidentified’ an average of 17% of the vowels when spoken by the 
panel of 15 talkers, and 9% when each of three tests was spoken by a single 
individual (a representative man, woman, and child from their respective 
groups). The difference between these two averages, 8%, is a measure of the 
error attributable to talker variation. Although this is a statistically significant 
difference [¢(50 df) = 5.14, p < .01], its absolute magnitude is surprisingly 
small, Less than half the total number of errors obtained in the variable-talker 
condition can be attributed to talker variation. 


7We have taken the intended vowel of the talker as the criterion of correct identification. 
That is, we have defined an identification error as a response by the listener that does not 
correspond to the phonemic category intended by the talker. It might be the case that errors 
so defined are as much due to mispronunciation as to misperception. No correction was 
given to talkers during recording other than to clarify orthographic confusions in a few 
instances. In the case of the youngest children, some coaching was required before they 
pronounced the nonsense syllables. However, no adult models were provided immediately 
prior to utterances that were included in the tests. 
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Listeners can identify vowels in consonant-vowel-consonant (CVC) syllables 
with considerable accuracy even when they are spoken by an assortment of 
talkers deliberately chosen for vocal diversity. The intended vowel was identified 
on 83% of the tokens in a test designed to maximize ambiguity contributed by 
vocal tract variation.® In a second study, listeners were asked to identify 15 
vowels (monophthongs and diphthongs) spoken /h-d/ context by 30 talkers. 
Here, the rate of identification errors was 13% overall and 17% for the nine 
monophthongs alone. The results of both studies are in essential agreement with 
earlier perceptual data reported by Peterson and Barney (1952) and Abramson 
and Cooper (1959), although the error rates obtained by these investigators were 
even lower than those we obtained.” 

These findings do not bear out the common assumption of a critical need for 
extended prior exposure to a talker’s speech. The information contained within 
a single syllable appears to be sufficient in most cases to permit recognition of 
the intended vowel; familiarity with a voice seems to play a rather small role in 
the identification process. 


- ARE THE POINT VOWELS USED BY LISTENERS 
AS AIDS TO NORMALIZATION? 


We (Verbrugge, Strange, & Shankweiler, 1974) next examined the possibility 
that an introductory set of syllables increased the likelihood that a succeeding 
vowel produced by the same talker was correctly recognized. Because we wished 
to test a specific hypothesis about the stimulus information required for 
normalization, we did not employ an introductory sentence as Ladefoged and 
Broadbent (1957) had done. Instead, we introduced each target syllable by three 
precursor syllables; this provided three samples of the talker’s vowels and little 
else. In one condition of the experiment, the precursors were /hi/, [ha], and /hu/; 
these syllables contain examples of the talker’s point vowels. For a second 
condition, we chose /ht, ha, ha/, a set of nonpoint vowels that (like the point 


e tokens containing different vowels. These tokens 


® Each of the 15 talkers spoke only thre } 
han eight intervening tokens spoken by different 


were separated in the test by no fewer t 
talkers, Listeners were unable to judge how many talkers were included ona test. 

9We suspect that these studies made somewhat less severe demands on listeners’ percep- 
tual capacities than our own. In the Peterson and Barney (1952) study, listeners heard only 
10 different talkers on a particular test. Each talker spoke 2 tokens of each of 10 /h-d/ 
syllables. The study yielded an overall error rate of 6%. The Abramson and Cooper (1959) 
study employed eight talkers, each of whom spoke one token of 15 /h-d/ syllables. The 
overall error rate in that study ranged from 4 to 6%. An additional source of perceptual 
difficulty in our tests is the fact that Jaj and /ə/ are homophonous in the dialect of most of 


our talkers. 
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vowels) are quite widely separated in the space defined by the first and second 
formants. 

Neither set of precursor syllables brought about a systematic reduction of 
perceptual errors in identifying the target vowels. The errors in each precursor 
condition averaged 15% (compared to 17% in the earlier condition without 
precursors), but in neither case does this difference approach significance.'? The 
principal effect of the /hi/, /ha/, /hu/ precursors was to shift the pattern of 
responses somewhat, some vowels showing improved identification, others show- 
ing poorer identification. 

The idea that normalization is specifically aided by the point vowels, as 
suggested by Joos (1948), Gerstman (1968), and Lieberman (1973) is not 
supported by these data. In fact, no precursor syllables that we tried were found 
to have a systematic effect. A single, isolated syllable is usually sufficient to 
specify a vowel; prior exposure to specific subsets of vowels could not be shown 
to supply additional information. It would seem unnecessary to invoke a 
psychophysical weighting function in order to establish an internal adaptation 
level (cf. Ladefoged, 1967). We may surmise that the isolated syllable is not so 
ambiguous an entity as is sometimes implied. 

Because of the repetitive and stereotyped manner in which the precursors were 
presented, some readers might be inclined to doubt whether listeners made full 
use of the phonetic information potentially available and therefore to question 
whether these experiments are adequate to test the hypothesis. We can reply to 
this objection indirectly by referring to a further experiment in which the same 
precursor syllables did produce a measurable effect on perception of a subse- 
quent target syllable. This experiment involved the same 15 talkers as the earlier 
experiment, but differed in that the test syllables were produced in a fixed 
sentence frame: The little /p—p/’s chair is red. Each talker was instructed to 
produce the sentence rapidly, placing heavy stress on the word chair. The 
destressed, rapidly articulated /p—p/ syllables were excised from the taperecord- 
ing and assembled into the two new listening tests. In one condition, the /p—p/ 
targets were prefaced by the same tokens of the /hi, ha,hu/ precursors employed 
in the previous experiment. On this test, listeners made an average of 29% errors 
in identifying the vowels in the target syllables. In the other condition, no 
precursors were present. On this test, listeners misidentified 24% of the same 
vowels. Thus, misperception of target vowels occurred with significantly greater 
frequency when they were preceded by precursors [t(35 df) = 2.88, p <.01]. We 
may suppose that the precursors impaired recognition of succeeding vowels in 
this instance because they specified a speaking rate slower than that at which the 


1° This result was confirmed in a separate study of 15 vowels in /h-d/ context. When no 
precursors were present, errors averaged 13%. When each /h-d/ syllable was preceded by the 
syllables, /kip/, /kap/, /kup/, 12% of the responses were errors. The difference was not 
significant. 
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/p-p/ syllables were actually produced. Thus, whereas we failed to find evidence 
for effects of precursors on normalization of vocal-tract differences, we do find 
evidence for adjustment to a talker’s tempo (as hypothesized by Lindblom & 
Studdert-Kennedy, 1967), on the basis of preceding segments of speech. 


THE ROLE OF FORMANT TRANSITIONS IN VOWEL PERCEPTION 


Our results suggest that the identity of a vowel in a syllable spoken by a new 
talker is likely to be specified by information within the syllable itself. The 
phonetic context supplied by preceding syllables apparently serves a function 
other than that of adjustment for a new set of vocal tract parameters: it may 
enable the perceptual system to gauge the tempo of incoming speech and to set 
its criteria accordingly. We were encouraged by these preliminary findings to 
look for the sources of information that specify the vowel within the syllable, 
and to explore how that information is used by the perceiver in the process of 
vowel perception. 

As we noted earlier, the formant transitions in a syllable vary systematically as 
a function of both the consonant and the vowel. Therefore, we might expect 
that the listener utilizes information contained in the transitions in recovering 
the identity of the medial vowel. Research on the identification of isolated 
steady-state vowels (i.e., vowels that are not coarticulated with consonants) 
indirectly supports this expectation. Perception of isolated vowels is notably 
unreliable. Fairbanks and Grubb (1961) presented nine isolated vowels produced 
by seven phonetically trained talkers to eight experienced listeners. The overall 
identification rate was only 74%; rates for individual vowels ranged from 53% to 
92%. Slightly better identification of isolated vowels was obtained by Lehiste 
and Meltzer (1973) for three talkers, where, again, talkers and listeners were 
phonetically skilled. Fujimura and Ochiai (1963) directly compared the identi- 
fiability of vowels in consonantal context and in isolation. They found that the 
center portions of vowels, which had been gated out of CVC syllables, were less 
intelligible in isolation than in syllabic context. i 

Research bearing on this question has also been done with synthetic speech. 
Millar and Ainsworth (1972) reported that synthetically generated vowels were 
more reliably identified when embedded in an /h-d/ environment than when 
acoustically identical steady-state target values were presented in isolation. 
Finally, Lindblom and Studdert-Kennedy (1967) noted that listeners used dif- 
ferent acoustic criteria to distinguish pairs of vowels depending on whether 
judgments were made on isolated vowels or on the same vowel targets embedded 
within a CVC environment. pas 

There are at least two ways that the transitional portions of the acoustic signal 
might provide information for vowel identity. One possibility is that transitions 
play a role in specifying talker characteristics. Since the loci of formant transi- 
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tions for a particular consonant vary with differences in vocal-tract dimensions 
(Fourcin, 1968; Rand, 1971), transitions might serve as calibration signals for 
normalization. Particularly when the phonemic identity of the consonants is 
fixed and known to the listener, the transitions might serve to reduce the 
ambiguity of the vowel by providing information about vocal-tract characteris- 
tics of the talker who produced the syllable. 

We may also envision a second possibility that is at once more general and 
more parsimonious: the acoustic specification of vowels, like consonants, is 
carried in the dynamic configuration of the syllable. In other words, the syllable 
as a whole cospecifies both consonants and vowel. In this view, transitions may 
be regarded as belonging to the vowel no less than to the consonants. If this were 
true, we would expect that the perception of medial vowels would be aided by 
the presence of consonantal transitions regardless of whether the perceiver 
encounters many talkers on successive tokens or only one. 

To make an experimental test of these possibilities, we (Strange, Verbrugge, & 
Shankweiler, 1974) constructed a new set of listening tests that contained a 
series of isolated vowels. In one condition the vowels were spoken by the same 
panel of 15 talkers described above. In a second condition, a single talker 
produced the full series of vowels. Together with the earlier tests with |p-p/ 
syllables, these materials allowed us to compare the relative effects on vowel 
identifiability of two major variables: presence or absence of consonantal en- 
vironment and presence or absence of talker variation within a test. This also 
placed us in a position to evaluate the alternative hypotheses about how 
consonantal environment contributes to vowel perception. 

According to either hypothesis, we would expect that the perception of 
isolated vowels would be less accurate than the perception of medial vowels on a 
listening test in which the tokens are produced by different talkers. However, 
the two alternative hypotheses generate different expectations concerning the 
error rate on isolated vowels and medial vowels when the talker does not vary 
within a test. If the advantage of consonantal environment is due to use of 
transition cues for normalization, we could expect to obtain no difference 
between performance on these two conditions, because in neither case is there a 
need for repeated calibration. Therefore, we would expect that vowel recogni- 
tion would be as accurate for the isolated vowels as for the medial vowels. If, on 
the other hand, the consonantal environment provides critical information for 
the vowel independent of talker-related variation, we would expect a difference 
in consonantal environment to affect performance whether or not talkers vary 
within a test. Thus, we would expect identification of isolated vowels to be less 
accurate than medial vowels even for tests in which the talker did not vary. 

The results for the isolated vowel tests support the latter hypothesis. The 
average error in the variable-talker condition was 42% (compared to 17% errors 
on the comparable test in which vowels were spoken in /p-p/ environment). This 
increase in errors is consistent with either hypothesis. However, the results for 
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the single-talker condition also showed a large increase in errors when there was 
no consonantal environment. The average error in the single-talker condition was 
31% for the isolated vowels (compared to 9% errors on medial vowels). Moreover, 
a vowel-by-vowel comparison showed that for every vowel in both talker condi- 
tions, the error rate on the isolated vowel was greater than on the corresponding 
medial vowel. Both major variables (Consonants Present versus Absent and 
Talker Variation Present versus Absent) were shown to produce significant 
differences in overall errors [F(1, 94 df) = 125.17 and 21.18, respectively, p < 
01]. The decrease in accuracy of vowel recognition due to the absence of 
consonantal environment was approximately the same whether talkers varied or 
not (i.e., the analysis showed no significant interaction between variables). We 
may surmise, therefore, that consonantal transitions do not aid in specifying a 
vowel by providing information for a normalization stage. On the contrary, these 
results indicate that the presence or absence of transitions is much more critical 
for accurate recognition than the degree of experience with a talker’s vocal-tract 
parameters. Whereas the presence of within-test talker variation impairs recogni- 
tion by only about 8%, the absence of a consonantal environment impairs 
performance by more than 20%. 

The possibility cannot be overlooked, however, that the relatively poor percep- 
tion of isolated vowels is attributable primarily to the talkers’ inability to 
produce them reliably. Since isolated vowels do not occur in natural speech 
(with a few exceptions), talkers may produce them in peculiar ways, with 
formant frequencies uncharacteristic of the values found in natural syllables. 
Also, the characteristic relative durations of the vowels (Peterson & Lehiste, 
1960) might not be preserved by talkers in their productions of isolated vowels. 

To investigate these possibilities we undertook spectrographic analysis of the 
tokens of isolated vowels and medial vowels used in our listening tests. Center 
frequencies of the first three formants and vowel duration were measured for all 
the tokens in the variable-talker tests, as well as for tokens of all nine vowels 
spoken in isolation by each of the 15 talkers. The data provided no evidence that 
the isolated vowels were produced in an aberrant manner. Average formant 
frequencies for men, women, and children correspond quite closely to those 
reported by Peterson and Barney (1952) (for vowels in /h-d/ environment), with 
the exception of /9/.!! When the formant frequencies of each talker’s isolated 
and medial vowels are compared, the values are found to be highly similar. 
Measurements of vowel duration also fail to account for the increased error rate 
for isolated vowels. Although the durations of these were for the most part 
longer than the vowels in Ip! environment, the relative durations of the nine 
isolated vowels are much the same as the relative durations of vowels in 
consonantal environment. We may suppose, therefore, that the higher error rate 


11 This deviation is due to a dialect difference between our group of talkers (predomi- 


nantly natives of the upper Midwest) and Peterson and Barney’s group. 
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tions for a particular consonant vary with differences in vocal-tract dimensions 
(Fourcin, 1968; Rand, 1971), transitions might serve as calibration signals for 
normalization. Particularly when the phonemic identity of the consonants is 
fixed and known to the listener, the transitions might serve to reduce the 
ambiguity of the vowel by providing information about vocal-tract characteris- 
tics of the talker who produced the syllable. 

We may also envision a second possibility that is at once more general and 
more parsimonious: the acoustic specification of vowels, like consonants, is 
carried in the dynamic configuration of the syllable. In other words, the syllable 
as a whole cospecifies both consonants and vowel. In this view, transitions may 
be regarded as belonging to the vowel no less than to the consonants. If this were 
true, we would expect that the perception of medial vowels would be aided by 
the presence of consonantal transitions regardless of whether the perceiver 
encounters many talkers on successive tokens or only one. 

To make an experimental test of these possibilities, we (Strange, Verbrugge, & 
Shankweiler, 1974) constructed a new set of listening tests that contained a 
series of isolated vowels. In one condition the vowels were spoken by the same 
panel of 15 talkers described above. In a second condition, a single talker 
produced the full series of vowels. Together with the earlier tests with /p—p/ 
syllables, these materials allowed us to compare the ‚relative effects on vowel 
identifiability of two major variables: presence or absence of consonantal en- 
vironment and presence or absence of talker variation within a test. This also 
placed us in a position to evaluate the alternative hypotheses about how 
consonantal environment contributes to vowel perception. 

According to either hypothesis, we would expect that the perception of 
isolated vowels would be less accurate than the perception of medial vowels on a 
listening test in which the tokens are produced by different talkers. However, 
the two alternative hypotheses generate different expectations concerning the 
error rate on isolated vowels and medial vowels when the talker does not vary 
within a test. If the advantage of consonantal environment is due to use of 
transition cues for normalization, we could expect to obtain no difference 
between performance on these two conditions, because in neither case is there a 
need for repeated calibration. Therefore, we would expect that vowel recogni- 
tion would be as accurate for the isolated vowels as for the medial vowels. If, on 
the other hand, the consonantal environment provides critical information for 
the vowel independent of talker-related variation, we would expect a difference 
in consonantal environment to affect performance whether or not talkers vary 
within a test. Thus, we would expect identification of isolated vowels to be less 
accurate than medial vowels even for tests in which the talker did not vary. 

The results for the isolated vowel tests support the latter hypothesis. The 
average error in the variable-talker condition was 42% (compared to 17% errors 
on the comparable test in which vowels were spoken in /p-p/ environment). This 
increase in errors is consistent with either hypothesis. However, the results for 
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the single-talker condition also showed a large increase in errors when there was 
no consonantal environment. The average error in the single-talker condition was 
31% for the isolated vowels (compared to 9% errors on medial vowels). Moreover, 
a vowel-by-vowel comparison showed that for every vowel in both talker condi- 
tions, the error rate on the isolated vowel was greater than on the corresponding 
medial vowel. Both major variables (Consonants Present versus Absent and 
Talker Variation Present versus Absent) were shown to produce significant 
differences in overall errors [F(1, 94 df) = 125.17 and 21.18, respectively, p < 
01]. The decrease in accuracy of vowel recognition due to the absence of 
consonantal environment was approximately the same whether talkers varied or 
not (i.e., the analysis showed no significant interaction between variables). We 
may surmise, therefore, that consonantal transitions do not aid in specifying a 
vowel by providing information for a normalization stage. On the contrary, these 
results indicate that the presence or absence of transitions is much more critical 
for accurate recognition than the degree of experience with a talker’s vocal-tract 
parameters. Whereas the presence of within-test talker variation impairs recogni- 
tion by only about 8%, the absence of a consonantal environment impairs 
performance by more than 20%. 

The possibility cannot be overlooked, however, that the relatively poor percep- 
tion of isolated vowels is attributable primarily to the talkers’ inability to 
produce them reliably. Since isolated vowels do not occur in natural speech 
(with a few exceptions), talkers may produce them in peculiar ways, with 
formant frequencies uncharacteristic of the values found in natural syllables. 
Also, the characteristic relative durations of the vowels (Peterson & Lehiste, 
1960) might not be preserved by talkers in their productions of isolated vowels. 

To investigate these possibilities we undertook spectrographic analysis of the 
tokens of isolated vowels and medial vowels used in our listening tests. Center 
frequencies of the first three formants and vowel duration were measured for all 
the tokens in the variable-talker tests, as well as for tokens of all nine vowels 
spoken in isolation by each of the 15 talkers. The data provided no evidence that 
the isolated vowels were produced in an aberrant manner. Average formant 
frequencies for men, women, and children correspond quite closely to those 
reported by Peterson and Barney (1952) (for vowels in /h-d/ environment), with 
the exception of /o/."! When the formant frequencies of each talker’s isolated 
and medial vowels are compared, the values are found to be highly similar. 
Measurements of vowel duration also fail to account for the increased error rate 
for isolated vowels. Although the durations of these were for the most part 
longer than the vowels in /p-p/ environment, the relative durations of the nine 
isolated vowels are much the same as the relative durations of vowels in 
consonantal environment. We may suppose, therefore, that the higher error rate 


" This deviation is due to a dialect difference between our group of talkers (predomi- 
nantly natives of the upper Midwest) and Peterson and Barney’s group. 
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for isolated vowels compared to that for vowels in a fixed consonantal environ- 
ment cannot be explained on the grounds that isolated vowels tend to be 
produced in an aberrant manner. 

The message of these perceptual data is clear: Isolated, sustained vowels, 
although they correspond well to the phonetician’s idealized conception of a 
vowel, are poorly specified targets from the standpoint of the listener.'? Lehiste 
and Peterson (1959) found that many hours of practice were needed by 
untrained listeners before they could identify isolated vowels accurately, even 
when the tokens were painstakingly produced by a single phonetically trained 
talker. The ability to identify these “ideal” vowels may be a highly specific skill 
with little relevance to the identification of vowels in natural speaking situations. 

At this point the objection might still be raised that the tests used to measure 
the perceptual difficulty of medial vowels are unrepresentative of natural condi- 
tions. One possibility is that there may be an advantage associated with con- 
sonantal context of the context is known beforehand (/p—p/ in this case), but 
that this advantage would be largely eliminated if the identity of neighboring 
consonants were unknown (as is often the case in natural speech). To test this 
possibility, we constructed a listening test in which the target vowels were 
enclosed by a variable consonantal environment. A panel of 12 talkers (a subset 
of the original 15) spoke a series of consonant-vowel-consonant syllables. In 
each syllable, one of the six stop consonants (/b, d, g, p, t, k/) appeared before 
the vowel and one of the six appeared after the vowel; consonants were selected 
so that each occurred equally often in each position. One group of listeners was 
asked to identify only the vowel in each test token; a second group was asked to 
identify the two consonants as well as the vowel. The average error in identifying 
the vowels was 22% for the first group and 29% for the second. Both error rates 
are well below the 42% error rate obtained on the variable-talker test with 
isolated vowels. In other words, even when listeners do not know the identity of 
either the consonants or the vowel, recognition is significantly more accurate for 
medial vowels than for isolated vowels. 

A second possible objection to the earlier tests with medial vowels might be 
that syllables spoken in isolation (in “citation form”) are unrepresentative of the 
syllables found in rapid, connected speech. The medial vowels in rapidly spoken 
syllables might be at least as difficult to identify as isolated vowels, since the 
vocalic portions of such syllables often fail to reach the steady-state values 
characteristic of syllables spoken in citation form. The study reported earlier in 
this chapter bears directly on this question. When /p-p/ syllables spoken in 
destressed position were excised from a carrier sentence and assembled into a 


12 The formant space is less compressed for isolated vowels than for medial vowels, Thus, 
if the values of static first and second formants were the primary carriers of vowel quality, 
isolated vowels should be better perceived because their acoustic values are more widely 
separated. 
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listening test, listeners made an average of 24% errors in perceiving the medial 
vowels. This is not much greater than the 17% error rate for perception of /p-p/ 
syllables read from a list, but is substantially less than the 42% error rate for 
isolated vowels. One might have guessed that the brevity of the short /p-pl 
syllables and their failure to reach steady-state values would make them more 
difficult to identify than isolated vowels, which are longer in duration and more 
stable acoustically. Apparently, the presence of a consonantal environment more 
than compensates for these difficulties. 


CONCLUDING REMARKS ON THE PROBLEM OF VOWEL CONSTANCY 


Let us consider what we have learned about how the perceiver might achieve 
constancy of vowel quality. In our studies of vowel perception the objective was 
to isolate sources of vocalic information in the natural speech signal. We 
employed signals that presented as many characteristics as possible of normal 
conversational speech, including a representative range of signal variations that 
result from physical differences among talkers. 

Each way of conceptualizing the vowel contains an implied solution to the 
problem of perceptual constancy. We first considered the assumption that the 
vowel can be characterized by a steady-state output of the vocal tract, and that, 
to a first approximation, fixed formant loci are associated with each vowel 
quality of all speakers. To the extent that this assumption is correct, the 
constancy problem is trivial. Only minor adjustments for variation would be 
required. 

We saw that this conception of the vowel as a simple acoustic event, segmented 
in time and in spectral frequency composition, was widely shared among 
students of speech, including those who initiated earlier attempts at automatic 
speech recognition. We have reviewed a number of findings that are incompatible 
with this view. First, steady states are the exception, not the rule, in continuous 
natural speech. As a result of coarticulation of vowels with preceding and 
following consonants, the syllable is not discretely partitioned, and the informa- 
tion for the vowel is smeared throughout the syllable. Moreover, the variability 
occasioned by the phonemic environment of a vowel is compounded by the 
changes that accompany different speaking rates and different vocal tract sizes. 
In retrospect, it is easy to see why attempts to design a generally useful speech 
recognition machine have so far failed. 

A more sophisticated conception of the vowel acknowledges the problem of 
variability but continues to assume that vowels, even in running speech, can be 
perceived with reference to a single set of acoustic values. This view proposes 
that tokens of the “same” vowel fall on a line in vowel space defined by the first 
and second formants. The formant frequencies of two talkers’ vowels would 
then be constant multiples of one another. We noted that this relationship could 
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not literally hold, because vocal tracts differ in shape as well as in size. This rules 
out an analogy to a melody played in a different key, or to a magnetic tape 
recording played back at a different speed. The failure of these analogies is 
revealed by Peterson and Barney’s (1952) measurements of first and second 
formant frequencies in men, women, and children. The results of the measure- 
ments (displayed in Fig. 2) showed wide disperson of formant values for 
different speakers with considerable overlap of formants for neighboring vowels. 
Even when one considers only those tokens on which perfect agreement was 
obtained by listeners, much scatter among formant values is observed (cf. 
Peterson & Barney’s, 1952, Fig. 9). 

Failing to find the invariant relation preserved by linear scaling, investigators 
have sought a transformation that might yield a closer approximation. For 
example, it has become an accepted practice to plot units of frequency (Hz) on a 
scale of mels (Peterson, 1961; Ladefoged, 1967).'* Transformation of formant 
frequencies to mels might be defended on the grounds that this unit reflects the 
response of the auditory system to frequency. However, we are skeptical that 
the constancy problem can be illuminated by a search for the right scale factor. 
Ladefoged (1967), who attempted to reduce variability by employing phoneti- 
cians as talkers, concluded that separation of all vowels cannot be attained by 
scaling the first- and second-formant frequencies, whether in linear fashion or 
nonlinearly, as on a scale of mels. 

Although no one has succeeded in demonstrating a generally applicable scaling 
(normalizing) function, it is widely assumed that perceivers must apply such a 
function to each new talker they encounter. There has been speculation about 
the minimal stretch of speech required for calibration. Ladefoged’s (1967) 
application of adaptation-level theory to the problem of speaker normalization 
reflects the common assumption that some extended sample of a new talker’s 
utterances is required for determining the weights that enable the normalizing 
adjustments to be made. As we noted, Joos (1948) and Lieberman (1973) 
proposed that ambiguity of a new talker’s utterances can be resolved by 
reference vowels that permit the perceiver to construct a model of the talker’s 
vowel space, scaling the input according to parameters derived from these 
calibration vowels. 

Listeners can apparently adjust their criteria for perception of synthetic vowels 
according to the formant ranges specified by a precursor sentence (Ladefoged & 
Broadbent, 1957). The successful performance of Gerstman’s (1968) nor- 
malizing algorithm indicates that frequencies of the first and second formants 
could, in principle, suffice for this purpose. However, we doubt, as does 
Ladefoged (1967) himself, that first and second formant frequencies exhaust the 


13 A mel is a psychophysical unit reflecting equal sense distances of pitch and bearing an 
approximately logarithmic relation to frequency for frequencies above 1000 Hz (Stevens & 
Volkmann, 1940). 
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FIG. 2. First- and second-formant frequencies of American-English vowels for a sample of 


76 adult men, adult women, and children. The closed loops enclose 90% of the data points 
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sources of information that specify a vowel in a natural speech signal. '* 


Moreover, the fact that listeners can perceive randomly ordered syllables ac- 
curately indicates that there is little need for a mechanism that requires a sample 
of several syllables in order to construct a normalization schema. Finally, in our 
own experiments with natural speech, we failed to find that point vowel 
precursors, or another set of widely spaced vowels, brought about a systematic 
improvement in recognition of the following vowel. 

Our results do not, therefore, support the view that vowels are relational values 
in a metric space that must be scaled according to other vowels produced by the 
Same talker. If that theory were correct, it is difficult to see how precursors 
could fail to improve recognition of an immediately following medial vowel. The 


** This is also Peterson’s (1961) conclusion, based upon studies of filtering. 
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presence of coarticulated consonants within the syllable proved far more useful 
for categorizing natural vowels than prior experience with a talker’s utterance. In 
sum, our studies failed to provide supporting evidence for current conceptions of 
the normalization process. They force us to consider whether there is justifica- 
tion for a separate and preliminary normalization stage in speech perception.'® 

A major difficulty with all the proposals stated above is that they view the 
invariance problem in terms of the relation among formant frequencies of 
relatively sustained vowels. Even if such efforts to discover an algorithm were 
successful, they would not be sufficient to explain perception of vowels in 
natural conversational speech because in such utterances a region of steady-state 
energy is rarely present in the signal. The presence of sustained acoustic energy 
at certain “target” frequencies is not essential for identification of a vowel under 
natural listening conditions. On the contrary, there is evidence that changing 
spectral patterns are much superior to sustained values as carriers of vowel 
quality. We cited earlier reports that isolated steady-state vowels are poorly 
perceived, even after listeners were given substantial training and when the target 
vowels were spoken by phonetically trained talkers (Fairbanks & Grubb, 1961; 
Lehiste & Meltzer, 1973). The results of our perceptual studies definitely 
confirm the perceptual difficulty of isolated vowels. Listeners misidentified 31% 
even when all items within a given test list were produced by the same talker. 
Moreover, vowels coarticulated with surrounding consonants, as is normal in 
Tunning speech, were considerably more intelligible than isolated vowels spoken 
by the same talkers (e.g., 9% of vowels in /p-p/ environment were misidenti- 
fied). It seems unlikely, therefore, that the perceptual system operates by 
throwing away information contained in formant transitions. Indeed, Lindblom 
and Studdert-Kennedy (1967), in studies with synthetic speech, demonstrated 
that listeners use this information directly in their placement of vowel phoneme 
boundaries. Vowel identifications varied with direction and rate of transitions 
even when the formant frequency values at the syllable centers were held 
constant. In short, it is futile to seek a solution to the constancy problem by 
analysis of any acoustic cross section taken at a single instant in time, and we 
must conclude that the vowel in natural speech is inescapably a dynamic entity. 

Heuristic procedures for automatic recognition of consonants often begin by 
guessing the identity of the coarticulated vowel, since it is known that the 
specific shapes of the formant transitions are conditioned by the vowel. The 
vowel is assumed to be a stable reference point against which the identity of the 
consonant may be determined. But this, of course, presupposes that the vowel is 
more directly available than the consonant. We have now examined a number of 
indications that the problem of perceptual constancy may be no less abstract for 


15 But see Summerfield and Haggard (1973). These investigators measured an increase in 
reaction time to synthetic syllables from different (simulated) vocal tracts which they 
interpret as reflecting extra processing time required for a normalization stage. 
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the vowel than for the consonant. We found that isolated steady-state vowels 
provided especially poor perceptual targets. Moreover, there were not sub- 
stantially more errors in identifying the medial vowels of rapidly spoken syl- 
lables (where steady states were presumably not attained), compared to errors 
on medial vowels in syllables read from a list. It is clear that we can no longer 
think of defining vowels in terms of acoustic energy at characteristic target 
frequencies. A recognition device based on filters tuned to specific frequencies is 
as unworkable for vowels as it is for consonants. If the idea of the vowel target is 
to be retained, it must take account of the dynamic character of the syllable. 

Lindblom’s (1963) conception has this virtue. Formant contours are charac- 
terized as exponential functions that tend toward asymptotic “target” values 
associated with the vowel nucleus. Thus, a target can be defined acoustically 
even though it corresponds to no spectral cross section through the syllable. Our 
perceptual data presented here are compatible with this view that vowels are 
specified by contours of moving formants with certain invariant properties over 
stretches of approximately the length of a syllable.!© 

We may find a parallel to this conclusion in studies of speech production. 
Investigation of the manner in which phonemes are joined in the syllable reveals 
context-dependent relationships similar to those we have noted in the acoustic 
signal. Lindblom’s (1963) dynamic theory of vowel articulation is, in fact, an 
attempt to explain how contextual influences on the acoustic and phonetic 
properties of vowels are produced. According to this view, undershoot in 
running speech is brought about by inertia in the response of the articulators to 
motor excitations occurring in rapid temporal succession. Invariant neural events 
corresponding to vowel targets thus fail to bring articulators to the positions 
they assume when the vowel is produced in a sustained manner. 

Lindblom’s (1963) inference of articulatory undershoot during rapid speech 
has been confirmed by cinefluorographic data (Gay, 1974). His account of the 
mechanism of undershoot has not gone unchallenged, however. MacNeilage 
(1970) presented a different view of the inherent variability of speech produc- 
tion. Basing his conclusions on extensive electromyographic studies of context 
effects in articulation, he argued that variability of muscle contraction is not to 
be understood merely as an unfortunate consequence of mechanical constraints 


16 Our understanding of the vowel has been influenced by Gibson’s (1966) approach to 
the problems of event constancy in visual perception, which is to seek regularities in the 
stimulus pattern that can only be defined over time. A similar approach is taken by Shaw, 
McIntyre, and Mace (1974), and by Shaw and Pittenger (this volume). We tend to agree with 
these authors that the dynamic invariants specifying an event may be perceived directly by 
perceptual systems that are appropriately tuned. While Lindblom (1963) and Lindblom and 
Studdert-Kennedy (1967) offer a dynamic characterization of vowels, they apparently made 
the usual assumption that only temporal cross sections can be directly perceived, and they 
supposed that vowel perception is mediated by a process of analysis-by-synthesis in which 
the dynamic invariants are used to compute possible input patterns. 
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on articulator motion, but as necessarily built into the system in order to permit 
attainment of relatively invariant target shapes. Gay (1974) cited his cinefluoro- 
graphic findings in support of MacNeilage’s hypothesis that variability of gesture 
must be regarded as a design characteristic. However, unlike MacNeilage, he 
concluded that there was variability not only of gesture, but of spatial target, 
since he failed to find invariance of vocal-tract shape for a central vowel /a/ that 
held across speaking rates and consonantal environments. The same conclusion 
was drawn by Nooteboom (1970), who argued that the kinds of reorganization 
that occur in talking with the teeth clenched make it difficult to retain the idea 
of invariant spatial targets. Just as the attainment of a specific acoustic target 
value is not necessary for the successful perception of a vowel, it is probably the 
case that the attainment of a specific target shape is not necessary for its 
effective production. Thus, in production as in perception, it has become 
increasingly difficult to entertain the notion of an invariant target for each 
vowel, as long as the meaning of invariance is restricted to a specific vocal tract 
shape and its resonances. It is likely that the units of production, like the units 
of perception, cannot be defined independently of the temporal dimension. 
Speech, viewed either as motor gesture or as acoustic signal, is not a succession 
of static states. Invariance in vowel production, then, can be discovered only in 
the context of the dynamic configuration of the syllable. 

The reader might wonder at this point whether the various productions and 
acoustic forms of a vowel are so heterogeneous that no coherent physical 
definition (however abstract) could be found that embraces them all. Perhaps 
the required invariance is not to be found in the acoustic signal at all. If it is not, 
a radical solution to the constancy problem is to suppose that the variants of a 
phoneme are physically unrelated, and to assume that the brain stores separately 
a prototype of each vowel and consonant for every phonemic environment. If we 
can extend to speech perception an argument made by Wickelgren (1969) 
concerning its production, then Wickelgren’s hypothesis of “‘context-sensitive 
allophones” is such a proposal. However, the proposal has little to recommend 
it. Halwes and Jenkins (1971) find a number of flaws, two of which are critical. 
First, the proposal fails to capture the phonological relations that are known to 
be important in understanding both the production and perception of speech. 
Second, it ignores the “creativity” inherent in the production of speech that 
permits the reorganization of articulatory movements to maintain intelligibility 
even when normal speech movements are blocked, as when talking with the 
teeth clenched or with food in the mouth, or when under the influence of oral 
anesthesia. 

In light of the evidence we have surveyed, it is obvious that attempts to 
understand the psychophysical constancy relations in speech have failed to 
discover transparent isomorphisms between signal and perception. This failure 
has led many to doubt whether a psychophysics of speech could ever illuminate 
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the constancy problem. But certainly, it does not follow from the complexity of 
the psychophysical relation that the signal fails to specify the phonemic message 
uniquely.'” The emphasis that current theories place upon the relational nature 
of the vowel is misleading because it underestimates the richness of the signal in 
natural speech, a richness that is attested by the great tolerance of the perceptual 
system for a degraded speech signal (as in noisy environments or after filtering). 
To abandon the search for acoustic invariants because the psychophysical 
relations are complex would surely be a backward step. It should be appreciated, 
however, that commitment to the principle of invariance does not bind us to a 
literal isomorphism between signal and percept. The weight of evidence conclu- 
sively opposes a one-to-one mapping of perceptual segments and their dimen- 
sions upon physical segments and their dimensions. In the case of vowels, we 
have argued that the invariants cannot be found in a temporal cross section but 
can only be specified over time.'® For vowels, as for other phonologic segments, 
a major goal of research is to discover the appropriate time domains over which 
invariance might be found. 
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Disorganization in Thought 
and Word 


Michael P. Maratsos 


University of Minnesota 


As psychologists we have an evident and reasonable preoccupation with trying to 
bring about order in our description of how human beings think and in our 
descriptions of their cognitive organizations. There is indeed much order and 
coherence to be described and analyzed; but I think that it is a simple fact that 
much of adult thinking, concept organization, reasoning, and arguing is at best 
partially coherent. Concepts are often fuzzy, reasoning is frequently faulty, and 
arguments frequently end in impasses or proceed in wrong-headed ways that are 
hard to make sense of. 

This chapter addresses the problem of disorganization—or perhaps rather partial 
organization. Much of it has clear ancestry in some of the outstanding work of 
the twentieth century in the fields of word meaning and thinking, particularly 
philosophical linguistic analysis by the figures Wittgenstein and Austin, and 
developmental analysis by men such as Vygotsky, Piaget, and Werner. All these 
figures were concemed with this problem, though the latter three tended to 
regard it as a problem almost completely confined to children (Piaget), or to 
children, savages, and the insane (Vygotsky; Werner also includes the lower 
classes); a conclusion which I think both Wittgenstein and Austin would rightly 
find much too generous to civilized adults. The chapter begins with a discussion 
of the organization of word meanings, both synchronically (as in use at one 
time) and developmentally (as they change in the course of a language history or 
in the course of acquisition). In the succeeding section the discussion turns to 
similar topics in human classification, concept formation and argumentation. 
The thesis shall be that indeed similar problems arise in both areas of discussion, 
and for noncoincidental reasons; in particular the problems will be traced to 
problems in human judgments of partial and complete similarity between two 
situations or problems. We begin with a discussion of the partial disorder to be 
found in the structure of word meanings. 
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WORD MEANINGS 


Synchronic Organization 


By synchronic organization is meant the organization of the meaning of a word 
at a single time in the language. It has long plagued philosophers and linguists 
alike that the meaning of a word often does not appear to be constant across 
uses. Nevertheless, the notion that a given word has a meaning which is constant 
if one could only ferret it out has been a common one, as noted in Austin 
(1963) and Wittgenstein (1953). Both of these writers argued, and convincingly 
so, that words do not commonly have such a single meaning. What makes it seem 
so is that often the various meanings of a word may persistently share something 
of meaning or connotation so that the ensemble seems unified. Instead, the 
meanings of words have various kinds of resemblances among each other which I 
shall try to outline below." 

One kind of resemblance, noted by Aristotle, was called by him a paronymous 
organization. To illustrate such an organization we may consider the meanings 
and uses of the words healthy, sad, and happy. In all three cases there seems to 
be a nuclear meaning which is found in the phrases healthy person, sad person, 
and happy person. The words refer to states of a person. Now, for the case of 
healthy, consider the following uses: “His complexion is healthy”—healthy 
complexion; “That exercise is healthy”—healthy exercise. In the former case, 
healthy seems to mean “symptomatic of a state of health in a person,” and in 
the latter “causative of a state of health.” Similarly for sad face or happy face 
versus sad book or happy book. Our uses of the terms are also bound by 
real-world suppositions. The phrase sad complexion sounds peculiar because we 
do not usually think of complexions as indicating sadness, though they may 
indicate health. 

Such structures are found readily in other words, though not as often as might 
be expected.” The word earn, for example, means to “‘acquire by labor, merit, or 
performance,” or “deserve by labor, merit, performance.” We may even say 
sarcastically of someone that he earns, say, $20,000 a year as a professor, but he 
hardly earns any of it, pointing out that he acquires the salary but does not 


"Nida (1958), endeavoring to show that the meanings of a word may have no relation at 
all, cites the case of charge, which may mean, among other things, what the Light Brigade 
did and what a merchant does for his goods. Such cases are commonly called homonyms, 
two separate words which happen to have the same phonological realization, and are com- 
pletely irrelevant to the point. 

*Such cases occasionally fail to generalize in surprising ways. For example, one does not 
use the expression jealous face, to mean a face which seems to express jealously (compare to 
sad face, happy face, healthy complexion), or at least the usage is a strained one. And there 
seems to be no use of jealous to mean “causative of” (as in healthy, in healthy exercise), 
although such usages are easy to conceive. 
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merit it. Meanings may revolve around a central meaning in other ways. Bronfen- 
brenner (1960) pointed out that Freud used the term identification to mean both 
the process of identifying with someone and the outcome of the process. The 
central concept may be restricted to various extents among meanings as well. 
The term family revolves around a notion of kinship through bloodlines, but 
obviously differs in meaning in the following uses: “What position do you hold 
in your family?” “Is your family concentrated in one geographical area, or do 
they live all over the country?” “When did your family first come to America?” 
In the first case the reference is usually taken to be to the nuclear family: father, 
mother (husband, wife), and children. The second question can refer to the 
living members of the known wider family—cousins, uncles, and so on. In the 
third question, family clearly means a group of people united by a line of 
descent which may go back generations. A word such as day shows similar 
variation around a central notion of light-dark shift. It may refer to the 24-hr 
period which contains both day and night, or to just the light portion, or even to 
the working portion of a day, as in “how was your day?” 

A word may also contain meanings which seem to spread further and further 
apart around various central meanings. The word command, as a verb, has a set 
of complexly related meanings. It may mean to issue an order from a position of 
authority, as in “he commanded the troops to leave.” It may also mean having 
something at one’s bidding or disposal, as in “the commands many resources.” In 
the last meaning an element of clear success is added that is lacking in “he 
commanded the troops to leave,” but the notion of directly issuing an order is 
lost. The word may also mean to have control, as in one who commands a 
situation; to have a competence as in having command of a language, and so on. 
The various meanings seem to cluster around a notion of secure position or 
authority, with clear links among the terms; but successive meanings grow 
further away. Such relations are common. Consider heavy in “This thing is heavy 
(to carry),” “This atom is heavy (for an atom),” “This meal is heavy (hard to 
digest, dense, rich food),” “His features are heavy (massive).” 

In some cases it no longer seems that the meanings of the word necessarily 
share a common element of meaning. Rather there are a number of elements 
which keep cropping up among the words, no one of which perhaps appears in 
all. Wittgenstein called such word meanings family resemblances, noting how in 
families various facial features reappeared in different members, but often no 
one feature appeared in all. The result in such cases is a feeling of overall 
resemblance and unity without any unique unifying element. Vygotsky called 
such structures chain complexes. He noted that word meaning A might have 
something in common with word meaning B, and meaning B with meaning G 
and C with D, but different members of the chain might have almost nothing in 
common; for example, D and A might share almost nothing in common. 
Wittgenstein wittily pointed to just such a case in the German word spiel (which 
is translated into English as both play and game): it can mean what actors do on 
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the stage, what athletic players do, strategies and tricks people may indulge in 
socially and emotionally, and so on. Very commonly used terms in a language 
display such links endemically; the dictionary definitions of a word such as run, 
or in, abound in such interconnections. 

An interesting characteristic of such shared, linking meanings is that they need 
not be what would normally be considered part of the ‘essential’ meaning of 
both terms, taking ‘essential’ to be those aspects of the original referent of a 
term without which it appears the term cannot be applied. For example, “having 
shape” is an essential part of the term “hexagonal;” or similarly, “shapeless” can 
only be applied to things which have some shape (as opposed to totally 
nonconcrete things such as gravity). In contrast, friendliness is not a necessary 
part of the referent of dog, but many dogs are friendly and faithful, and for 
many speakers, dog connotes friendliness. Many links among meanings appear to 
be by such “nonessential” meaning links. For example, heavy in heavy features 
probably shares a link with normally heavy objects through the frequent mas- 
siveness of the latter. A particularly good example of such links can be found in 
slang terms such as screw. The most noncolloquial meaning of the verb is “to 
drive or tighten a screw; to fasten, tighten, or attach by or as if by means of a 
screw” (American Heritage Dictionary, 1969). By some resemblance of the 
actions involved, perhaps, screw has evolved to also stand for sexual intercourse 
(the noncolloquial meaning originated as a French importation into English, and 
so was probably earlier). Most likely because of traditional sex-role considera- 
tions, screw in the sexual sense has a strong contextual connotation of one 
person taking advantage of the other, which seems to provide a link to another 
meaning of screw, to take advantage of someone or cheat them. 

Such links by common characteristic situational overtone or empirical cor- 
relate, as discussed above for screw and heavy, characterize nearly any com- 
monly used verb or adjective, and many nouns. Metaphor commonly plays on 
such characteristic qualities of words and situations. For example, for someone 
to “light someone else’s (spiritual) way” means to make it possible for them to 
comprehend things. Light in general is associated with making it possible to see, 
and so forms many characteristic metaphorical links. People may also use 
connotation and context in everyday speech, in speaking of things such as a 
“true son” whereby they mean a son who characteristically does what sons at 
their best should do. 


Developmental and Diachronic Data 


Data from children’s semantic acquisition is at yet not widely available, but even 
so instances exist that are of interest. Labov (1973), in intensive observation of 
his own infant’s early language development, found that the initial organization 
of her first two words, cat and mama, indeed has a complexive nature. The child 
used mama to refer to any member of her family, and also to refer to visitors 
who stayed long enough at the house to become friendly. That the concept had 
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a stronger core of meaning revolving around her actual mother, however, was 
demonstrated by the fact that when asked where mama was, the child always 
pointed to her actual mother. The word cat was also used most frequently and 
vociferously of actual cats, particularly to the one at home. But the infant also 
applied the word to objects which shared aspects with true cats, such as furry 
objects, statues of animals, and so forth.? Schulte, cited in Preyer (1890) similarly 
cites the case of a child who first used the word ass to refer to a goat with rough 
hide that stood on wheels. The word was then used to refer to moving things, 
and then extended to include things with a rough surface. The infant seems to 
have formed an initial concept of the disjunction of two elements (that is, 
“moves” or “rough surface”). Maratsos (1974) has found that older preschool 
children (more so than younger ones) employ the environmentally correlated 
element of the height of an object (distance from its top to a common floor) in 
their definitions of the words big and tall. If a taller object was compared with a 
clearly shorter one which was raised to a higher point, children would often pick 
the higher object as the tall one; under some conditions a small rabbit toy lifted 
above a much larger elephant was judged to be the “big one” of the pair, 
although under normal conditions gross size defined the meaning of big. 

Aside from basically synchronic data like that cited above, the developmental 
literature also shows children extending the meaning of a term in a complex-like 
fashion at times. In the study by Maratsos cited above, the tendency to 
incorporate the element of highness grew with age in preschool children. Chain- 
complex-like word meaning changes may be found in diary accounts of infant 
semantic development. Piaget (1962) observed one of his children developing the 
following meanings in turn for no more; anything going away; anything over- 
turned; anything at a distance; said as the child held something for an adult to 
throw; said when the child wanted an object someone was holding; said to mean 
“begin again.” Another of Piaget’s children applied bow-wow first to dogs, then 
to a dog seen from their balcony, then eventually to anything (e.g., prams, 
neighbors, hens, dogs, horses) seen from the balcony. A child studied by 
Pavlovitch (1920, cited in Clark, 1973) first used the word bébé to refer to her 
own reflection in a mirror. The word was subsequently applied in new uses to a 
photograph of herself, all photographs, all pictures; all books with pictures, and 
finally to all books. Romanes (1888, cited in Clark, 1973, and in Vygotsky, 
1962) reports an infant who first used quah to refer to a duck in water. She then 
used it to refer to her milk in a bottle (liquid), to birds (ducks are birds), a coin 
with an eagle on it, and finally for a while to all coins. 


3Labov, in systematic cross-sectional work, also demonstrates a similar (though less 
scattered) organization of adults’ meaning for the word cup. The perceptual appearance of 
an object carries the greatest weight—something that looks just like a coffee cup is judged to 
bea cup no matter what its use. As the appearance approaches that of a bowl (by 
lengthening one edge more so that it looks more like a lip) the use to which the object is put 
becomes more important in speaker’s judgments. Something between a cup and bowl in 
appearance will be judged to be a cup if it holds coffee, but a bowl if it holds vegetables. 
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Extensive documentation also exists in diachronic linguistics (historical 
changes in languages) of meaning shifts of a complexive nature. The shift may 
occur through a shared core of meaning. The Latin verb pilare in archaic Latin 
meant “to be covered with hair.” By the imperial period it had come to mean 
“to depilate, to pull out the hair.” Often meanings have branched off from older 
ones by becoming first more specific: such a case is that of true, which originally 
meant “faithful, loyal,” as in “true to one’s word,” “true to one’s family.” With 
time it has also come to mean accurate, conformable to fact. 

Quite frequently newer meanings are related to older ones, as in synchronic 
organization, by a link of emotional overtone or more general, nonessential 
meaning. The German word Elend once meant (as Ellent) “foreign land.” 
Perhaps from contexts such as “banished into foreign lands,” it acquired the 
meaning of “exile,” and subsequently because of the emotional overtones of 
exile, its present meaning of “misery.” The German word ungefahr, employing 
both links of essential meaning and of contextual connotation, has changed 
successively from an original meaning of “without danger” to mean “without 
bad intent,” “without intent,” “incidentally,” and finally “approximately 
about.’ It is quite likely that present synchronic word organizations have 
resulted by the accretion of such new meanings without the loss of older ones, as 
in the case of true above. 

On the whole, then, there seem to be ample similarities among the synchronic 
links among word meanings, children’s semantic organizations, and the change of 
word meaning both in children’s acquisitions and in new word usages in language 
histories. In the following section we shall try to develop the thesis that similar 
kinds of organizations may be found in concept formation tasks and reasoning 
tasks in both children and in adults. We shall then attempt to elaborate the 
hypothesis that some of the similarities of organization among word structure, 
word change, and conceptual thinking are not coincidental (although some 


probably are), and analyze more fully the underlying cognitive basis for these 
problems. 


COGNITIVE ORGANIZATION 


The first task of this section is to document the presence of complexive 
structures in human concept formation and reasoning. A major part of the 
support is to be found in work with children’s concepts and concept elucidation. 


*Word changes may take even more surprising courses. Vygotsky (1962) cites the case of 
the Russian word sutki, which once simply meant “seam.” It came with time to mean 
“twilight,” the juncture of day and night. With more time it came to mean the twenty-four 
hour period between twilights, and now means the equivalent of the English day as a 
twenty-four hour period of time. Similarly, the word shambles changed from “table” to 
“butcher table” to “scene after a very bloody battle” to “scene after a very messy battle” to 
its present meaning. 
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Both Piaget (Inhelder & Piaget, 1969) and Vygotsky, employing slightly differ- 
ent methods, have found that children frequently classify shapes in complexive 
manners. Vygotsky (1962) essentially employed a nonsense-word concept for- 
mation task. His subjects were presented with an array of objects of different 
shapes—rectangles, squares, cylinders; different sizes—large, small; and different 
colors. On the bottom of each block was a nonsense syllable, such as yiv. Classes 
of objects predetermined by the experimenter had different names: Yiv might be 
placed at the bottom of all of the large triangles. At the beginning of the study, 
the experimenter would hold up one such shape and read off its name. He then 
suggested that the subject try to put together other shapes of the same name. 
Adults typically make a well-organized guess and stick to it. If a large yellow 
triangle had been held up, an adult might guess that yiv referred to all of the 
yellow shapes, and collect them together. When the subject was finished, the 
experimenter would then hold up one of the collected objects and read its name. 
For example, since yiv referred to large triangles, “yellow” as a guess would be 
incorrect, as would be shown by holding up a yellow square labeled bik. The 
subject could then try again. 

The interest lay not so much in how accurate the subject was as in what kinds 
of piles of objects he assembled. Adults, as mentioned above, typically made a 
systematic initial guess and persisted on that basis. Children behaved differently, 
in a number of ways. Early on, they might simply put together any pleasing 
array of shapes, based on momentary whims. A more advanced grouping was 
what Vygotsky (and we after him) called a chain complex. The child might, 
beginning with a small yellow triangle, next select a pair of yellow objects, a 
circle and a square. Next, however, he might change his apparent criterion for 
yiv and select a red square, which shared the characteristic of squareness with 
the previous object but no longer was yellow. Through similar chained sequences 
the child would arrive at a collection with many links among the objects but 
with no overall unifying principle. 

A still more advanced form of grouping was called an associative complex by 
Vygotsky. Given a small yellow triangle, the child would leave it in the center of 
the objects perhaps, and append to it triangles, yellow objects, or small objects, 
objects which each shared a characteristic with the initial object, but which 
themselves formed a complex. Such a structure is clearly analogous to the kind 
of word organization called paronymous (e.g., sad person, sad face, sad book). 

Inhelder and Piaget (1969), employing a task with less resemblance to a verbal 
concept formation task, obtained quite similar results. Their subjects received 
shapes of varied form and color and were asked to put the ones together that 
belonged together. Some subjects were shown hierarchically organized classifica- 
tions by adults as examples of what was meant. Nevertheless, they also obtained 
both associative and chain complexive structures. They also obtained, as a fairly 
common result, an early behavior consisting of arranging the shapes into wholis- 
tic forms such as a house. In free classification of this type children performed 
more adequately at an earlier age, arriving at unified hierarchical classification 
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schemes by the ages of six or so, as compared to Vygotsky’s finding that in his 
task, complexive organizations only ceased by adolescence. But the kinds of 
behaviors obtained led to quite similar structures. 

Children’s reasoning and explication also seem to employ complexive organiza- 
tion. The case par excellance is snycretic reasoning, discussed by both Piaget 
(1962, 1955) and Werner (1954). The basic problem of syncretic reasoning is the 
following: 


Situation A Situation B 
Characteristics a,b Characteristic a 
Therefore b 


Because of partial similarity of one situation to another, children may assume 
further similarities of one to another in an unjustified manner. Piaget (1962) has 
recorded instances of such reasoning in his younger daughter Jacqueline. 
At 2 years 10 months of age Jacqueline had a temperature and desired oranges 
to eat. It was, however, too early in the season for oranges to be obtained. It 
was explained to her that they were still green rather than orange. She seemed to 
accept the facts. But a moment later, as she was drinking her (orange) camomile 
tea, Jacqueline exclaimed “Camomile isn’t green, it’s yellow already . . . give me 
some oranges!” Similarly, Jacqueline had a friend who was a little hunchbacked 
boy, whom she sometimes went to see. At one time he was ill with influenza, 
and so she could not see him. She said “poor boy, he’s ill, he has a hump.” 
Later, on a walk, she asked whether the boy was still ill in bed. On being told 
that he was not, she said “good, he hasn’t a big hump now.” 

In the above, in what is also called tranductive reasoning, the child seems to 
teason from similarity of particulars to new similarities. For example, the 
similarity of tea and oranges in having colors and being ingestible leads to the 
conclusion that oranges are ripe because the tea is orange. Piaget (1955) also 
carried out work with older children. In this study they were given a list of 
sayings (e.g., “Filing can turn a stake into a needle.”) and a list of explications 
(e.g., “As we grow older we grow better.”). Children who occasionally made 
correct matches between sayings and explications also displayed many uses of 
syncretic reasoning in explaining the appropriateness of incorrect choices. It is 
impossible here to sample adequately from the rich store of examples Piaget 
gives. One example is given below: 


Mat (10; 0) connects the proverb “So often goes the jug to water that in the end it 
breaks” with the sentence “As we grow older we grow better.” Now the proverb has 
been understood verbally (i.e., literally), For Mat it means: “You go to the water so 
often that the jug cracks; you go back once again and it breaks.” The corresponding 
sentence is explained as follows: “The older you get the better you get and the more 
obedient you become.” Why do these two sentences mean the same thing (question of 
experimenter)? “Because the jug is not so hard because it is getting old, because the 
bigger you grow, the better you are and you grow old” [Piaget, 1969, p. 150]. 
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In their explications the children often made matches because of the possibility 
of finding some aspect of similarity between the proverb and the chosen 
explication; having found one similarity (or even participation in a common 
scheme) between the two, they picture the rest as similar as well. 

We will not discuss here related results from Werner (1954) or Bruner, Oliver, 
and Greenfield (1966), who also found that children would unite word meanings 
(Werner & Kaplan 1952) or concepts (Bruner) on the basis of various kinds of 
partial similarity and justify their unity in syncretic ways. In all these cases, 
there is an overattribution of further similarity of outcome or attribute on the 
basis of partial and often complexlike links. Adults, although the case is perhaps 
less systematically documented, clearly also employ complexive concepts and 
syncretic reasoning. Their difficulties are generally less easy to analyze because 
the concepts with which they deal are themselves more complex and abstract. As 
a result it becomes difficult to both keep in mind the concepts being reasoned 
about and their true relations and to do this well enough to see their complexive 
links, But I think that careful analysis can show entirely comparable problems in 
adult cognition. A perfectly common and applicable case is reasoning by anal- 
ogy, to which many of us are unduly wont. The dialogues of Plato (1928) in 
particular are marked by continual use of reasoning and proof by analogical 
means. In one sequence from the dialogue Euthyphro, Plato (through Socrates) 
considers the question“How should men serve the gods?” Trying to reason an 
answer to the question by beginning with clear cases, he considers a question he 
seemingly holds to be equivalent, “How do horses serve men?” The answer to 
this last question is then employed to answer the first one in an unconvincing 
manner. The problem, of course, is that the service of horses to men has some 
similarity with men’s service to the gods (and for idiomatic reasons, even the 
same verbal expressions can be used), but the similarity is clearly not enough to 
motivate the eventual conclusion of how men should serve the gods. The 
reasoning may be schematized: 


Situation A: Situation B: 
Horses serve men. Men serve the gods. 
X(horses) serve Y(men). X(men) serve Y(gods). 
X serves Y in manner Z. Therefore, as in A, X should 


serve Y in manner Z. 


Another general instance is found in an endemic form of argumentation. In 
such arguments, the point is whether two situations are the “same” (share 
enough important characteristics in common for a conclusion or outcome in one 
to be generalized to the other) or “different” (differ in so many important 
characteristics that no transfer is possible). To take a hypothetical example, 
Person 1 may argue “The United States now is just like the Roman Empire was 
when it began its fall. We have an originally energetic wealthy class, which is soft 
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now. The Roman Empire fell, and the United States will fall.” Person 2 may 
then argue that the two cases are not the same at all, because “No, it's 
completely different. Rome had no organized financial system which the United 
States does; and besides, Rome fell to the pressure of encroaching barbarians, 
and there are no encroaching barbarians now.” I think everyone has heard and 
participated in arguments like these. The first party piles up what seem to him 
important (often important for emotional reasons) similarities between two 
situations or people, claims they are “the same,” and so what was true of the 
first situation or person is true of the second. The second party brings up 
dissimilarities, claims the cases are different, and so denies the extension of 
similarity, The piling up of similarities and differences often seems to depend 
more on the psychological conspicuousness of certain aspects than on their 
relation to the main argument, 

A particularly strong characteristic of such discussions and arguments (and 
other lines of reasoning) is the insistence that two situations are either “the 
same” or “different.” Analytically it is clear that nearly any two objects or 
situations have some degree of difference. If we take same to mean “absolutely 
the same,” then nothing is the same as anything. We could hardly say anyone 
was the same person they were in any infinitesimal instant of time before, because 
of organic and psychological changes. In fact, when we say two things are the 
same, we simply mean they are the same in what seem to us important respects. 
But in many processes of conscious reasoning and argument, we act as though 
same meant “completely the same,” so that once it is shown that two situations 
are the same by similarity of important respects, then they are automatically the 
same in any other important respect. The same arguments apply to the predicate 
and concept different; the dissimilarity of a few (conspicuous) aspects may lead 
People to view two situations as different, hence “completely different.” 

Analytically, one can make valid inferences about characteristics shared by 
two situations only under the following conditions: if Situations X and Y share 
Characteristic a (or Characteristics a, b, c), and a (or a, b, c) can be shown to 
entail Characteristic 2 on logical or unimpeachable empirical grounds, then X 
and Y will both share z. Instead, people typically overattribute similarity of 
difference on what are essentially correlational grounds. They seem to believe 
that there is a magical condition of “sameness” or “differentness” which can be 
automatically decided on the basis of number or importance of similarities. (Of 
course, it is quite reasonable to guess, hypothesize, or suggest that if X and Y 
share Characteristic a (or a, b, €) in common, and X has z, then Y will also have 
2. If in the past only X has had both a (or a, b, c) and z, so that the association is 
oe unique, then it becomes even more reasonable to guess that Y also 

z 

Adults assuredly think, argue, and prove—they believe—things in this manner. 
Moving from the above general kinds of examples, I shall try to now give more 
specific examples that seem to be exemplary cases of adults reasoning about 
difficult subjects. 
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Case 1: Psychologist K. is a well-known psychologist who in lecture developed 
the following line of reasoning. Wishing to show that motives are concepts, he 
proceeded as follows: 


1, Concepts are clearly cognitions. 
2. A line of argument was given to demonstrate that motives are also 


cognitions. 
3. Therefore, motives are concepts. 


A perfectly parallel argument may be found in noting that cats are animals, dogs 
are animals, therefore cats are dogs. The similarity of motives and concepts in 
being cognitions led K. to believe that it followed that they shared enough 
further similarities to claim that motives are concepts. 


Case 2: Charles Rosen, a musicologist, has written an excellent book called 
The Classical Style (1972) about the musical style of the classical composers 
Haydn, Mozart, and Beethoven. He describes sonata form as a combination of 
thythmic, harmonic, cadential, and structural elements, Discussing a section 
from Haydn's Creation, he notes that it shares many of the attributes of sonata 
form, and concludes: 

For Haydn, the “sonata” is not a form at all, but an integral part of the musical 

language, and even a necessary minimum for any large statement that can be made 

within that language. The themes here are reduced to very small fragments, as are the 
musical paragraphs, but the proportions of a sonata movement without an isolated 
development section but with articulated expositon and symmetrical recapitula- 

tion . , . is as present as ever (Rosen, 1972, p. 370]. 


Rosen's procedure in this passage was to claim that since the musical organiza- 
tion of the discussed section employed some clements of what is called a sonata 
form, the piece is a sonata form. The presence of some elements of sonata form 
justifies uniting the section with other examples of sonata form, and attributing 
further similarities as a consequence. It is a matter of little difficulty to imagine 
the objection of another musicologist that given the elements of sonata form 
that are missing or severely transformed, the section Is hardly in sonata form 
at all. 

Within my own field (developmental psychology) there seems to me to exist 
examples of syncretic reasoning. One example, 1 think, is Piaget's claim that 
thinking and cognition consists of mental actions. This idea has, I think, 
profound influence yet insufficient basis in his work. Piaget's general strategy 
has been to demonstrate that the mind is “active” in rearranging and transform- 
ing elements for all cognitive opérations. Therefore cognition is a series of 
actions. In cases in which the claim seems particularly dubious he generally 
points to some element of activity in the situation and concludes that the 
generalization holds. For example, considering the case of seriation of objects of 
different sizes (such as the arrangement of sticks of different lengths into a series 
starting with the shortest and ending with the longest), he points out a possible 
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objection: does not the essential operation seem to be a perceptual comparison 
of heights that itself involves little activity? But, Piaget (1965) argues, when the 
child performs the task, he performs actively in moving the sticks together for 
comparison, arranging them, and the like, so clearly seriation is an action. 
Similarly, he has argued that even seeing involves active elements such as moving 
the eyes about to trace the outline of the object, so it also is an action. The 
general resemblance of such reasoning to syncretic thought seems clear. Nor is 
the problem simply a terminological one, a slight definitional extension of the 
word action, as those familiar with Piaget’s work can attest (though they might 
disagree with points in the above very briefly made analysis). Piaget argues that 
the bases of cognition must lie in the sensorimotor activities of infancy. For 
conceptual activities have active elements like those of infancy. If cognitions are 
actions, then they clearly take their origin in similar overt actions, Furth 
(1969), in his work Piaget and Knowledge, has in effect noted that there are 
indeed similarities between the organizations of early sensorimotor activities and 
later conceptual activities, but not identity enough for the former entirely to 
motivate the latter. He emphasizes Piaget’s concept of reflective abstraction, 
which seems to consist of the filtering from overt motor activities to internal 
cognitive operations of common, but not necessarily motoric, organization and 
principles. Such emphasis involves, I think, a considerable weakening of Piaget’s 
original claims. Piaget’s reasoning often appears to be clearly syncretic. His 
arguments that cognitions are actions comprise a clear case where such reasoning 
has profound effects upon his general thought, the necessary hastiness of the 
analysis offered here notwithstanding. 


CONCLUSIONS 


The cases discussed above—child and adult reasoning—provide ample (analogi- 
cal) similarities to the descriptions of the different types of word structures 
and developments discussed earilier. For clearly the verbal cases involve extend- 
ing the usage of a word (call this the additional characteristic) by means of a 


SOf course at this point I am employing one of the characteristic strategies of syncretic 
thought. It is being suggested from the similarity among the kinds of structures formed in 
the different areas of discussion that a further similarity obtains in the manner of their 
ontogenesis. There is nothing intrinsically misleading in such a procedure. As long as the 
natural tendency to attribute further similarities on the basis of existing ones does not 
become a claim of proof, the method is a fertile source of suggestions, guesses, hypotheses, 
and insights. Many human activities depend on this kind of thought in a positive way: 
activities in language such as poetry and the use of metaphor, and activities in cognition 
such as hypothesis formation. Evolutionarily, in fact, the use of continued attribution of 
similarity provides clear advantages. The basis of generalization is essentially to treat 
different situations in a similar fashion on the basis of what seem to the organism like 
criterially similar attributes among contexts. The strategy is bound to lead to false conclu- 
sions and unrewarding behaviors, but without it, little activity is possible that is not bound 
to particular situations. 


ee eee 
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partial overlap of similarity of a new possible context of usage with an old one in 
which use of the word was appropriate. For example, if the basic meaning of the 
word sad is that one uses it to describe the state of a person who is sad, 
extending it to refer to a situation or referent which is symptomatic of someone 
in the state of sadness (sad face) is to extend the use of the word on the basis of 
an overlapping linkage, the state of someone who is sad. In more extreme cases, 
if Elend, which originally meant “foreign land,” became overextended to mean 
“exile,” it was probably by means of the linkages between “foreign land,” and 
“exile.” The two are not equivalent, but may have been enough so for extension 
of the verbal usage from one situation to the other. A similar situation holds for 
the more connotationally-bound extension of Elend as “exile” to Elend as 
“misery,” in which the link is not one of essential meaning, but of conspicuous 
overlap of connotation. The claim, then, is that it is possible that people will 
extend an old word to a new situation on the basis of some link, a link which by 
the general rules of linguistic usage should not be sufficient (if we take this rule 
to be that a word should be used only in situations which fulfill exactly all the 
previous essential criteria for its use). 

At the same time, even if one person extends his use of a term, others must 
assimilate the new usage for it to become a part of the language. Here the fact 
that the formation of complexive structures characterizes thought extensively 
may be important in people’s judging others’ new usages “equivalent enough” to 
extend to the novel situation, and in their ability to adequately store the new 
use. There is little reason to doubt that people do indeed become confused and 
inaccurate over what counts as “equivalent enough.” It would have been impos- 
sible otherwise to believe they sensed unified, constant meanings in terms which 
in fact had meanings united only by family resemblances. Especially when faced 
with situations which require thinking of a slightly new type, people easily 
confuse one type of similarity or equivalence for another—for example, confuse 
the linkage characteristic of family resemblances for the linkage characteristic 
of a truly unified, single determined concept or meaning such as is occasionally 
found in scientific concepts or even in a few ordinary language concepts. People 
could thus easily accept new usages which had some relation to the old as 
equivalent enough for lexical usage, particularly if the linkage were through a 
very conspicuous element in the situation. (Quite similarly they may also be 
confused or wrongly convinced by syncretic arguments.) 

Of course, the proposals here, largely ones that seek to extend and make clear 
the analyses implied in earlier work, are only tentative ones. In various cases, 
such as that of language change, processes other than that of complexive growth 
as described above could result in similar outcomes. One widely suspected means 
of linguistic change is the practice of euphemistic reference—referring to an 
unpleasant situation with terms appropriate for a neutral or pleasant one. 
Imbecile, for example, at one time simply meant “weak”. It was apparently 
applied euphemistically to persons of “weak mind,” and with time its true 
contextual meaning came to predominate (Robertson & Cassidy, 1954). Wanton 
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similarly once meant “untaught.” Speakers do not seem confused or to over- 
estimate similarity when they refer euphemistically; the misuse is initially 
purposeful, though the change of normal meaning may not be. Another possible 
source of noncomplexive change may be found in simple misunderstanding 
between generations. In most of the above cases it has been assumed that the 
person(s) who overextended a word use correctly apprehended the original 
meaning of the term. But new permanent meanings for a word may also have 
arisen from a misunderstanding on the part of someone who previously had no 
understanding of the older meaning. For example, if after a while imbecile was 
mostly used of imbeciles, a speaker growing up and hearing the word used 
mostly in that context could easily understand it as having only the meaning 
proper to its more modern use without ever apprehending the older. Thus a 
younger generation’s understanding of the word use of the older generation may 
also be a source of meaning change. 

The existence of such alternate hypotheses can be taken as cautions that we 
cannot directly infer causal relations from the similarities among change and 
organization in thought to those in word meanings.° To extend this work it 
remains to obtain more information from naturalistic observations of language 
change and acquisition and experimental demonstrations (Werner & Kaplan, 
1952, have provided us with one study of this kind already), On the whole, 
nevertheless, the parallels between complexive and syncretic thought and the 
types of word change and organization discussed in this chapter are too clear for 
dismissal. Too many findings and observations already available stem more 
reasonably from the hypothesis of complexive thought as a causal factor for 
alternative hypotheses to be satisfactory, complete substitutes.” 


°Few chapters of this length can discuss an entire area and this is no exception. Many 
interesting problems in word meaning and word change have been completely passed over. 
Among them stand such interesting tendencies as that of people to make reference to 
someone or something by a part (Redbeard, Bluebeard as names; Redcoats to refer to British 
soldiers in the American Revolutionary War) or even by some closely associated object (The 
Crown for the English monarch and his authority; Pharoah, which originally referred not to 
the Egyptian ruler but only to the great house in which he dwelt—with time it was used to 
refer directly to the ruler himself; our own White House,which is becoming more animate 
with time, in such sentences as The White House today declared its intentions of co- 
operating fully with communist authorities in the Mideast). It seems likely that such a 
means of reference stems from a natural tendency of the human mind to tag or refer to 
fuller experiences by conspicuous or associated parts; an image which is itself incapable of 
representing the fullness of a remembered situation may nevertheless help in the memory of 
it or in keeping it in mind (Bartlett, 1932). 

7A child whose language development is being studied by the author and others has, for 
example, been observed to use the word horn (the meaning of which was previously quite 
clear to him) to mean the action of touching someone or something else with a horn, a 
quite novel meaning. He seems to have abstracted from noun-verb pairs like hammer-to 
hammer, shovel-to shovel the overly broad hypothesis (rarely applied even by him) that one 
can say fo X to mean “to do something with X.” Such a hypothesis would be impossible if 
complexive thinking were not a natural process. Werner (1954) cites similar cases from a 
German-speaking child. 
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More positively, I should like to conclude this discussion by noting an 
important and perhaps not obvious theoretical premise of most of the exposition 
in this paper, namely, that people may “know” something at one level of 
functioning, such as the true relations among situations, or the proper equiv- 
alences necessary to satisfy word use, but be confused or lax about relations at 
another level, such as word usage in a situation that is somewhat novel, or about 
judging whether a new word use is consonant with old ones enough to be 
extended, or about judging whether a word has a unified meaning. The problem 
of knowledge at one level of something known at another, I should like to 
suggest in passing, seems to me indeed a crucial one. We have often assumed in 
the past, I think, that something is known in just one way, so that any procedure 
tapping that knowledge is equivalent to any other, since all procedures tap the 
same thing. So for example, it has been claimed (Menyuk, 1971) that young 
children define words in terms of the functions of their referents, since when 
they are asked, for example, what a bottle is, they are likely to say 

“It’s something that you put stuff in.” Similarly, since children at five or six 
give word associations like the verb bark to dog, rather than another noun such 
as cat, they do not functionally use word classes like noun or verb, or organize 
words in that fashion. But there is no reason to accept such results at face value. 
Five- and six-year old children are now known to use words syntactically 
depending on their word class with great discrimination and accuracy. Using 
word classes to organize responses in word association tasks is a different 
problem. The general matter is probably best illustrated in a study of Piaget’s. 
He asked children if various words were strong or not—for example, “Is the word 
daddy strong?” Children at five or six generally answered in terms of the 
referent of the word, rather than answering that words themselves are not strong 
or weak, but simply refer. So one child answered that the word daddy is strong 
because daddies are strong, while umbrellas are weak. Some children were 
traditional—they knew better, but kept forgetting and slipping back, as the 
following passage shows: 

Aud (8; 8): “Are words strong? - No, words are nothing at all. They aren't strong, you 

can’t put anything on them.—Tell me a word.—‘Curtains,’ It isn’t strong because if you 

put anyting on it, it tears. A word isn’t strong because you can’t build up anything on 
it,—... Are there any words that are strong?—No. —Tell me another word.—‘Umbrella- 

stand.’ It is strong because you can put umbrellas init... [Piaget, 1969, p. 58]. 


Such work, and other work, has been widely cited to show children cannot 
disentangle a word from its referent. Badly stated, the proposition is of course 
nonsense. When children hear the word fire they do not form blisters on their 
skin or even flinch back as though from heat. In normal use children do not 
confuse words with their referents. But at the level of functioning required by 
being asked to consciously consider problems such as Piaget’s, they do indeed. 
There is no single, unified knowledge of word-referent relations; rather there are 
different levels of functioning at which the relation may be known. Similarly, 
the fact that six-year olds may define words in terms of function is interesting, 
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but does not display their “meaning of the word.” I have asked people, for 
example, whether the meaning of the word convince, as in “X convinced Y to do 
something,” means that Y necessarily carries out the action. They generally say 
“yes” when the question is proposed. But in fact convince does not entail such a 
definition, and people know it does not. When asked if one could say that “Mary 
convinced John to leave, but he didn’t after all because he changed his mind,” 
they agree there is nothing at all odd about such a usage, which shows that 
convince does not entail, by meaning, the performance of the infinitival action. 
(Compare to force, as in Mary forced John to leave, in which case it is 
anomalous then to state that John did not leave.) Their conscious inspection of 
their meaning for convince probably involves inspection of its normal use, in 
which the infinitival action is generally performed; but their conscious inspec- 
tions provide no direct route to the meaning they actually have stored for use of 
the term. 

So it seems reasonable to propose that someone, whether child or adult, might 
truly understand a concept or lexical item, and even generally work on the basis 
of strict equivalence of essential elements, and yet overextend the use of a word 
(or concept) in a novel use, and accept new uses as being ones that fit the old, or 
accept complexive structures as unified ones. The general problem of judging at 
one cognitive level the relations of similarity and equivalence at another is 
clearly one that deserves attention. The resulting complexive thought seems 
characteristically a source of human hypotheses, guesses, judgments, concepts; it 
seems unlikely that it is not also a characteristic source of much of the 
instability in our use, organization, and intuitions of the structure of words (see 
Footnote 7). 


ACKNOWLEDGMENTS 


The original talk from which this paper is taken was written while the author was being 
supported by grants HD-01136 and National Science Foundation Grant GB35703X. 


REFERENCES 


The American Heritage Dictionary. Boston: Houghton-Mifflin, 1969. 

Austin, J. L. The meaning of a word. In Charles Canton (Ed.), Philosophy and ordinary 
language. Urbana, Illinois: University of Illinois Press, 1963. 

Bartlett, F. Remembering. Cambridge, England: Cambridge University Press, 1932. 

Bronfenbrenner, U. Freudian theories of identity & their derivatives. Child Development, 
1960, 31, 15-40. 

Bruner, J., Olver, R., & Greenfield, P. Studies in cognitive growth. New York: Wiley, 1966. 

Clark, E. V. What’s in a word. On the child’s acquisition of semantics in his first language. In 
T. E. Moore (Ed.), Cognitive development and the acquisition of language. New York: 
Academic Press, 1973, 65-110. 


12. DISORGANIZATION IN THOUGHT AND WORD 363 


Furth, H. Piaget and knowledge. Englewood Cliffs, New Jersey: Prentice-Hall, 1969. 

Inhelder, B., & Piaget, J. The early growth of logic in the child. New York: Norton, 1969. 

Labov, W. A child’s meaning for cat and mama. Colloquium delivered at the University of 
Minnesota, 1973. 

Maratsos, M. When is a high thing the big one? Developmental Psychology, 1974, 10, 
466-475. 

Menyuk, P. The acquisition and development of language. Englewood Cliffs, New Jersey: 
Prentice-Hall, 1971. 

Nida, E. A. Analysis of meaning and dictionary making. International Journal of American 
Linguistics, 1958, 24, 279-292. 

Pavlovitch, M. Le langage enfantin: Acquisition du serbe et du francais par un enfant serba. 
Paris: Champion, 1920. 

Piaget, J. The language and thought of the child. Cleveland: World Publ., 1955. 

Piaget, J. Play, dreams, and imitation. New York: Norton, 1962. 

Piaget, J. The child’s conception of number. New York: Norton, 1965. 

Piaget, J. The child's conception of world. Totowa, New Jersey: Littlefield, Adams, 1969. 

Plato (no first initial). The works of Plato. Selected and edited by Irwin Edman. New York: 
The Modern Library, 1928. 

Preyer, W. The mind of the child, New York: Appleton, 1890. 

Robertson, S., & Cassidy, F. G. The development of modern English. (2nd ed.) Englewood 
Cliffs, New Jersey: Prentice-Hall, 1954. 

Rosen, C. The classical style: Haydn, Mozart, Beethoven. New York: W.W. Norton, & Co. 
Inc. 1972. 

Romanes, G. J. Mental evolution in man. London, 1888. 

Vygotsky, K. Thought and Language. Cambridge, Massachusetts: MIT Press, 1962. 

Werner, H. Change of meaning: A study of semantic processes through the experimental 
method. The Journal of General Psychology, 1954, 50, 181-208. 

Werner, H., & Kaplan, E. The acquisition of word meanings: A developmental study. 
Monographs of the Society for Research in Child Development, 1952, 51. 

Wittgenstein, L. Philosophische Untersuchungen (translated by D. E. M. Anscombe). Ox- 
ford: Blackwell Press, 1953. 


arean 


A ANIC 


= 
AE EEEN 


y 


U 
(S eae a7 
r J v 


13 


Resemblances in Language 
and Perception 
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University of Connecticut 


This is one thing that cannot be learned 
from anyone else, and it is the mark of great 
natural ability, for the ability to use 
metaphor well implies a perception of 
resemblances. (Aristotle, Poetics; cited in 
Hawkes, 1972) 


Accounting for the perception of resemblances has been one of the most 
important and obstinate problems facing experimental psychology. How do we 
recognize the similarity of an event to one experienced before, in spite of 
differences in detail? It is a problem of describing both what it is that we learn 
about a type of event and how it is that we detect and later recognize its 
characteristics. The interest in these questions is attested to by a huge literature 
on concept formation and identification, stimulus and response generalization, 
perception (including the perception of speech sounds and linguistic forms), 
recognition memory, learning set, and so on. 

Until recently, there has been surprisingly little research on how we apprehend 
events through the medium of linguistic forms. In particular, there has been 
almost no psycholinguistic research on the ways language is used to draw 
attention to resemblances. Analogies appear in many different forms: similes 
(“the swinging putters left behind a wake like the shining weave of a flying 
fish”—TV sports commentator), metaphors (“we live in a time when meaning 
falls in splinters from our lives” —popular song), symbols (“he believes in Mother 
and apple pie”), proverbs (“look before you leap”), and models (“imagine that 
these two atoms are billiard balls on a pool table . . . ”). Beyond these there are 
parables, fables, poems, plays, scientific theories, and other works of fiction. The 
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frequency with which we use and encounter these linguistic forms makes it 
puzzling why there has been so little study of how we comprehend them-espe- 
cially given the central role of the perception of resemblances in human cogni- 
tion. 

The neglect of metaphor by contemporary psychology is probably due to the 
field’s origins in a philosophical and rhetorical tradition which had little place 
for such uses of language. For example, Aristotle argued that the primary 
function of language is the accurate description of reality, and that the closer the 
correspondence between words and the classes of natural things, the better this 
function will be fulfilled. Poetic language is “imitative,” not descriptive, and 
thus is distinct from the languages of rhetoric and logic, whose goals are 
persuasion and clarity: 


All such arts are fanciful and meant to charm the hearer. Nobody uses fine language 
when teaching geometry (Rhetoric III; cited by Hawkes, 1972). 


The Aristotelian position might be summarized (and caricatured) as follows. 
There is a set of primitive properties by which we may classify all objects and 
events. Words refer to specific classes (concepts) defined in terms of those pro- 
perties: hence the proper use of words may be defined in objective, physical, 
measurable ways. The meaningfulness of strings of words (sentences) may be 
evaluated as if they were selfcontained logical propositions about the properties 
of things. If different parts of a sentence entail incompatible properties, then the 
proposition is illogical and meaningless. Thus metaphoric sentences are strictly 
illogical. Their vividness and novelty may prove useful, in passing, to make a 
drab speech more eye-catching, but the meaning conveyed by the speaker is 
independent of them. 

The impact of this tradition is still being felt. Many twentieth century schools 
of thought (logical positivism, behaviorism, taxonomic and transformational 
linguistics) have maintained much of the Aristotelian view of language, and thus 
have shared a discomfort with metaphor and other subjective manifestations. 
Two underlying assumptions seem most critical. One is a positivistic attitude 
that structure can be discerned and defined in publicly available ways, that is, 
that structure is independent of the observer. The second is what Pepper (1942) 
has called “immanent formism,” the belief (in part) that the structure of things 
is manifested directly in their physical form (“on the surface”), that the natural 
world can be classified by these overt physical properties. 

These attitudes have had a crippling effect on psychological study of the 
Processes by which we perceive resemblances, both in the comprehension of 
analogic language and in the direct perception of shared structure. The goal of 
this chapter is to show that these two modes of apprehending commonalities are 
intimately related, perhaps identical in some respects. The approach to be 
developed will replace positivism with an interactive and perspectivist view of 
meaning, in which the interpretive and selective activities of the perceiver are 
critical; and it will reject immanent formism in favor of the view that the 
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perceived structure of things is best defined in terms of abstract, “formless” 
relations. Taken together, these views provide an exciting basis for studies of 
metaphor, analogic thought, and imaginative processes in general. 


THE PROBLEM OF LINGUISTIC DEVIANCE 


The study of language structure took a giant step during the “linguistic revolu- 
tion” of the last twenty years (e.g., Chomsky, 1957, 1965). It was a shift, in 
Pepper’s (1942) terms, from an “immanent formism” to a “transcendent form- 
ism.” Taxonomic linguists assumed that the segmentation of strings of phonemes 
and morphemes posed no problem since the identity of the structural elements 
was apparent (“immanent”) in the surface form. Transformational linguists, to 
the contrary, asserted that there were deeper identities than were manifested on 
the surface, in at least two senses. First, there were grammatical relations which 
might not be apparent in the “surface structure”; for example, Sentences (1) and 
(2) are superficially more similar than (1) and (3): 


(1) Joan is easy to help. 
(2) Joan is willing to help. 
(3) It is easy to help Joan. 


But Sentences (1) and (3) are structurally equivalent at a deeper level, and dif- 
ferent in that respect from (2), since (4) is not equivalent to (2): 


(4) It is willing to help Joan. 


A second sense in which sentences bear a deep identity is that utterances of asen- 
tence may show variability due to grammatical errors, phonetic substitutions, hesi- 
tations, dialect differences, etc. In spite of these surface deviations from grammati- 
cal and phonological norms, we are able to recognize the underlying sentence 
which a speaker intended. This is similar to our ability to classify a particular tree 
as an oak, in spite of any deformities it may have due to local environmental 
conditions.! The concept of an oak tree, or a grammatical sentence, or a perfect 
gas, transcends any particular manifestation. 

This shift to a transcendent formism has been productive for linguistics and it 
has forced psychologists to think about how an organism might perceive struc- 
tural relations which are inherently abstract. Even so, the transformationalists 
have maintained many of the traditional assumptions: that sentences may be 
described in isolation, that sentences may be treated as self-contained proposi- 
tions in a logical calculus, that the meaning of sentences is assembled from 


1 Pepper ascribes this example to Aristotle-as-transcendent-formist. An extensive study of 
the “growth policies” underlying the appearance of natural forms has been made by 
Thompson (1942), 
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meanings associated with words, and that meanings are properties in a classifica- 
tional scheme not unlike the Aristotelian theory of types. The following argu- 
ments are an attempt to demonstrate that this residue of positivism and imma- 
nent formism makes certain linguistic theories inadequate as a basis for a 
psychology of comprehension. The inadequacies will be particularly obvious in 
the case of understanding metaphoric language. 

The boldest semantic theory, in the context of Chomsky’s (1965) transforma- 
tional grammar, is that proposed by Katz and his colleagues (e.g., Katz, 1971, 
1972; Katz & Fodor, 1963). Katz has the virtue of being relatively explicit, so 
the psychological implications of his position may be more clearly stated. Katz’s 
(1971) goal is to account for the semantic properties and relations of sentences— 
synonymy, antonymy, contradiction, meaningfulness, entailment, etc. Very 
crudely his position is as follows. The underlying deep structure of a sentence is 
the best place to define semantic as well as syntactic relations. Lexical items may 
be substituted for the nonterminal symbols at the deep structure level and thus 
will be labeled according to their role in the phrase structure (noun, verb, 
auxiliary, etc.). This disambiguates the senses of the “same” word used as 
different parts of speech (e.g., “play” may be a noun or verb). Associated with 
each potential use of a lexical item is a set of semantic markers (e.g., one feature 
of “shadow” is -Physical Object) and selection restrictions (e.g., one restric- 
tion of “waterproof” is that it can only be used to predicate on a +Physical 
Object). Thus the sentence 


(5) Shadows are waterproof. 


will have incompatible semantic markers, and Katz’s system will brand it 
“meaningless.” The meaning of a sentence is a composite of the meanings of its 
parts, and the total proposition is meaningful only if it contains no logical 
inconsistencies. 

This is really the old Aristotelian system in a new disguise. What is lost is the 
naive view that sentence structure directly reflects the reality being described; 
Katz wisely asserts that language’s surface form is a partial and ambiguous 
reflection of an “underlying reality.” But what remains is (a) a classificational 
scheme which has a reassuringly substantive and immanent feel about it; (b) a 
basis for meaning which is asserted to be independent of people’s cognitions 
(“Ideas, thoughts, cognitions, etc., like utterances, are performance phenomena, 
while meanings, like phonological features and syntactic categories, are abstrac- 
tions that form part of competence”; Katz, 1971, p. 121); and (c) a logical 
system which labels metaphor as inexplicit and meaningless (Katz does not 
discuss metaphor directly, but his attitude is exemplified in such phrases as 
“unexplicated and highly metaphorical relation” and “metaphorical and loose 
notions”). 

The immediate problem at this point is that we can make sensible interpreta- 
tions of sentences which Katz’s system would brand as meaningless. For exam- 
ple, we could interpret Sentence (5) to mean that we can throw as much water 
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at a shadow as we like, but the shadow will go unharmed. Such interpretations 
(which we readily give to metaphors in context) are bound to pose a problem for 
those who leave the processes of human imagination out of their account of 
meaning. Psychologists, who have tended to borrow on the linguists’ concepts 
and terminology, typically express a mixture of fascination and helplessness over 
accounting for “deviant” uses of language. For example, Slobin (1971) writes: 
“You are also able to interpret deviant sentences.... In fact, much of the 
understanding of certain kinds of poetry is based on this ability to find 
interpretations for grammatically unusual constructions—an activity which can 
be especially pleasurable [pp. 4-5].” Chomsky (1965), too, recognized the 
need to account for interpretations given to expressions (not sentences!) which 
violated his intuitive selection restrictions, and expressed the hope that such 
interpretations could be made by adding a (totally unexplicated) process called 
“analogy”: 
[Expressions such as “sincerity may admire the boy”] deviate in some manner . . . from 
the rules of English. If interpretable at all, they are surely not interpretable in the 
manner of the corresponding sentences [e.g., “the boy may admire sincerity”]. Rather, 
it seems that interpretations are imposed on them by virtue of analogies that they bear 
to nondeviant sentences [Chomsky, 1965, p. 76]. 


Note the persistent dichotomy between deviant and well-formed, poetic and 
ordinary, figurative and literal, etc., in these accounts, along with the admission 
that users of the language are able to interpret both kinds of sentences. Is there 
any psychological justification for calling one form of comprehension typical 
and the other special, or for supposing that the processes of comprehension in 
the two cases are different at all? 

Consider the following sentence: 


(6) My aunt is a butcher. 


It is typical to describe two alternative interpretations for such a sentence: one 
literal, the other figurative. For example, I might be saying (a) that my aunt is a 
member of the profession that slaughters animals for market, or (b) that my 
aunt is like such a person in some unspecified way (which would have to be 
inferred in context). But where are the boundaries between sharing some, many, 
most, or all of the properties of “a slaughterer of animals for market”? Dictio- 
naries list several figurative (now standard) uses of “a butcher” which carry only 
part of the sense of “slaughterer of animals for market” or carry a sense which 
overlaps only partially with it: a dealer in meats, one guilty of needless blood- 
shed, a candy vendor on trains, a bungler. The boundary between standard and 
novel uses is difficult to define and changes continually. Dictionaries are jammed 
with metaphors, both dead and dying, and more are arriving every day. 

Does it make sense to say that we comprehend Sentence (6) by first trying the 
standard predicating potentials of “butcher,” only seeking a figurative sense if 
the first effort leads to logical inconsistencies, or a contradiction of some prior 
knowledge (for example, that my aunt is a plumber by trade)? The fact that we 
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seldom notice the potential ambiguity of such sentences is a prima facie, 
although not necessarily compelling, argument against such a two-stage model. It 
is also unparsimonious to assume that some interpretations (the literal) proceed 
like the self-contained “semantic reading” of the linguist, while other interpreta- 
tions (the figurative) must invoke special cognitive, inferential processes which 
have no formal role in linguistic theory. It appears more reasonable to assume 
that a complete account of semantic relations must include psychological vari- 
ables and must include them in the interpretation of all sentence forms. 


THE COMPREHENSION OF METAPHORIC SENTENCES 


In recent years more and more attention has been devoted to the integral role of 
a listener’s cognitive activity in defining the semantic content of words and 
sentences. One important component of this activity is a listener’s inferences 
about a speaker’s intentions, expectations, and beliefs. These variables are part 
of the structure of “speech acts,” the utterances of requests, threats, assertions, 
promises, etc., in social contexts (cf. Austin, 1962; Lakoff, 1972; Searle, 1969). 

There have been several attempts recently to define figurative and literal 
language as two types of speech act which happen to use a common form, for 
example, “an X is a Y” (Gordon & Lakoff, 1971; Mack, 1973; Steinmann, 
1975). The potential ambiguity of a sentence like (6) is resolved by intuiting the 
speaker’s intention. For example, Steinmann (1975) has argued, in the case of 
sentences like (6), that the “predicating potential” of “is a butcher” is realized if 
the speaker intends the literal meaning, but is abandoned if he intends it to be 
interpreted figuratively. Steinmann defines literal usage by the equivalence of 
“intended meaning” and “sentence meaning,” while a figurative use of language 
expresses an intended meaning different from what the sentence itself means. In 
the latter case, the “propositional-act potential” of another sentence is realized 
(for example, “my aunt is cruel”), but which other sentence(s) are intended by 
the speaker must be “inferred” by the hearer “with some guidance from the 
verbal and situational context.” This is the same awkward dualism assumed by 
Katz and Chomsky: the interpretation of literal sentences is straightforward and 
analytic, while a special process must be invoked for interpreting deviant forms. 
The speech act approach is less extreme than the earlier linguistic views since 
there is no pretense that literal “interpretation” is a process independent of real 
speakers and hearers. We intend to speak literally and infer (or perceive) such 
intents in others. 

Even so, the rigid dichotomy between literal and figurative meanings is 
troublesome. In some senses all language use is figurative, since the contexts in 
which linguistic forms are used are always novel. The situations in which use of a 
term is “appropriate” show gradations of similarity; there is no abrupt shift from 
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“standard” situations (defined by a fixed set of criterial properties) to “deviant” 
situations. (Recall the difficulty in bounding the standard uses of “is a butch- 
er.”) Our experience with word use does give us a strong sense of what contexts 
are most typical, and we can exploit this knowledge in interpreting the sentences 
we hear. But the acceptable manifestations of these constraints cannot be 
sharply bounded: expressing novel semantic intentions through familiar forms is 
an integral part of language use. 

Thus it is questionable whether there is an independent level of “sentence 
meaning” which is available to all users. Note that if we refuse to posit such a 
level of meaning, the speech act dichotomy between literal and figurative usage 
disappears, and with it the dual process model. This may be the most sensible 
and parsimonious move from the standpoint of modelling the processes by 
which language is used and comprehended, though it would still be necessary to 
explicate our fuzzy intuitions about “literal” usage. One possible basis for such 
intuitions would be the perception of a strong similarity between a particular 
context of word use and our experience of previous contexts, coupled with a 
tendency to treat concepts as if they had object permanence. 

The processes by which listeners infer the sense of a novel sentence have been 
receiving more and more attention from experimental psychologists (cf. Brans- 
ford & McCarrell, 1974, for a representative summary of this research). Most of 
the experiments to date have been designed to demonstrate the presence of 
inferential processes, that is, to show that subjects comprehend more from a 
sentence or passage than is explicitly provided. In particular, it is possible to 
demonstrate that the process of understanding a sentence can involve more than 
the criteria for literal use of its component words (that is, more than a set of 
normatively-defined semantic features), This is especially important if all lan- 
guage use is to some extent a figurative extension of previous uses. 

Verbrugge and McCarrell (1973) have reported a study of the role of extra- 
linguistic inferences in the comprehension of metaphoric sentences. Metaphors 
are a natural candidate for the study of such processes, since they typically set 
up an equivalence between a topic and a vehicle (what the topic is being 
compared to), but leave the ground (the nature of the equivalence) unspecified. 
Verbrugge and McCarrell reasoned that if this ground is inferred when a meta- 
phor is heard, then a statement of the ground should be an excellent prompt for 
the recall of all or part of the original sentence. In one experiment they 
presented subjects with a series of sentences like (7) and (8), or their topics 
alone (Sentence 9). (The related grounds are presented in parentheses.) 


(7) -Tree trunks are straws for thirsty leaves and branches. (are tubes which 
conduct water to where it is needed) 
(8) Tree trunks are pillars for a roof of leaves and branches. (provide 
support for something above them) 
(9) Tree trunks 
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In a prompted recall task, subjects were given a set of phrases and were asked to 
recall the topic most related to each one. When the phrase was a ground relevant 
to the sentence heard, recall was high; when the phrase was the ground appropri- 
ate to another metaphor involving the same topic, recall was poor. When subjects 
had heard only the topics, either ground produced intermediate recall. This 
suggests that a metaphor leads us to infer a particular predication on a topic. For 
example, the properties attributed to tree trunks in the above metaphors are not 
the sort of thing one would store as semantic markers in a lexicon (as criterial 
features for word use). The properties are part of our tacit knowledge of the 
structure and physiology of trees, and they rely for their activation on what we 
also know about the vehicle. In fact, in many cases the vehicle may lead us to 
understand something new about the topic, to note a property we were unaware 
of before. Thus the sense of a metaphor is not contained in the sentence itself, 
nor is such a sentence illogical and meaningless. Its meaning is a result of our 
inferring a basis for an equivalence which the sentence merely asserts. 

This research leads naturally to the following generalizations about the appre- 
hension of meaning through language: 


1. It is not the elementary linguistic forms which determine meaning, but the 
role they play in a larger relation. (This opposes the assumptions of immanent 
formism.) Thus the meaning of the phrase “tree trunks” in Sentences (7) and (8) 
is not a fixed set of properties, but is defined by the perceived role of tree trunks 
in a context suggested by a particular sentence predicate. 

2. Meaning is intrinsically a psychological relation. (This opposes the positiv- 
istic assumption that meaning can be characterized independent of a compre- 
hender.) Words and sentences take on determinate meaning only when em- 
bedded in a psychological relation. Meaning is a product of our interaction with 
words, it is what we apprehend about the world through the medium of words. 

3. Since only part of the import of a sentence is specified directly, language 
may be viewed as a set of abstract instructions to perform certain kinds of 
inferential and problem-solving procedures depending on the context (cf. Brars- 
ford & McCarrell, 1974, for a development of this position). For example, in 
Metaphors (7) and (8), the copula “are” directs the comprehender to justify an 
equivalence relation at whatever level it is possible and useful to do so. A 
sentence is like a piano score: a rough specification of what is to be performed, 
but abstract and incomplete enough to demand a lot of elaboration. A score 
constrains the pianist’s interpretation, but hardly determines it. Similarly, the 
interpretation of meaning is a cognitive performance guided by sentences: 


It seems that there are certain definite mental processes bound up with the working of 
language, processes of understanding and meaning. The signs of our language seem dead 
without these mental processes; and it might seem that the only function of the signs is 
to induce such processes; and that these are the things we ought really to be interested 
in [Wittgenstein, 1958, p. 3; italics in original]. 
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THE PERCEPTION OF EVENTS 


What activities are involved in the inference or extraction of the ground of a 
metaphor? Aristotle relates metaphor to the “perception of resemblances.” Is 
this itself a metaphor, comparing two separate sets of processes, one linguistic, 
the other perceptual? Or are the processes identical in certain ways? It is worth 
turning for a moment to the traditional domains of perception to see what 
account they give of our detection of similarities among events. 

Classes of events that we perceive as similar or to which we respond similarly 
have usually been characterized by psychologists as “concepts” or “equivalence 
classes.” The concept or class is typically defined in terms of the physical 
properties shared by its members, that is, in terms of their common elements or 
attributes. The properties are particular values along dimensions such as color, 
shape, size, orientation, intensity, frequency, etc. The implicit claim is that all 
classes of objects can be characterized exhaustively by a list of their distinctive 
visual features, auditory features, tactile features, etc. These features are presum- 
ably irreducible components of our environment and provide a sufficient basis 
for constructing our concepts of dynamic events as well as of the objects 
involved in them, Thus a system of “attributive concepts” is essentially the 
Aristotelian theory of types in a new guise, and it is remarkably similar to Katz’s 
system for classifying the word objects which are assembled into sentence 
events. 

Some of the problems with this approach to perception are a product of its 
immanent formism. Three problems are especially relevant to the question of 
perceiving resemblances through language: 


1. The first problem is one pointed out long ago by the Gestaltists: we can 
perceive the similarity of patterns of physical variables, in spite of changes in 
their specific physical composition. A pattern has an integrity which cannot be 
broken down into the attributes of its component parts. For example, a melody 
may be heard as the same in spite of changes in key, tempo, ornamentation, etc. 
The identity of the melody is determined by a particular ordering of its 
component notes; the relations among the frequencies and durations of the 
notes are what is critical. 

2. What seem to be categorical features of our environment may actually rely 
on complex relations among physical variables for their specification. This is 
certainly true of the acoustic basis for the perception of phonemic categories 
(cf. Studdert-Kennedy, 1974), and it is no doubt true of the optical basis for per- 
ceiving putative visual features (hues, shapes, sizes, etc.; cf. Gibson, 1950, 
1966). In other words, when psychologists label certain categories as “features” 
they presuppose (or ignore) a process of categorization which is intrinsically 
relational. We are able to recognize novel instances of a pattern just as we 
recognize novel uses of a term. This is possible in each case because our concepts 
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do not depend on the specific attributes of elements at the lowest level (physical 
parameters, semantic features, etc.; cf. Shaw & Wilson, 1976, for a more 
complete critique of this problem). 

3. The common elements approach implies a “subtractive” process in concep- 
tual abstraction, that is, generalizations are made by retaining only the proper- 
ties shared by all instances, and by discarding all particulars which are not shared 
(cf. Cassirer, 1923). The concept of “metal,” for example, becomes almost 
vacuous because the various instances of metals (gold, copper, silver, tin, etc.) 
have almost no physical features in common. Most of what we understand about 
metals involves structural relations among their components and their inter- 
actions with other substances. Such relational similarities will never be found in 
a search for common features if features are defined by local physical param- 
eters. 


Cassirer’s alternative to the common elements approach is to see concepts as 
abstract with respect to any particular manifestation. Elements are not dis- 
carded, but given an abstract place in a structure, like a variable in a mathemati- 
cal equation. Thus a concept is richer than any particular set of instances, 
because it can “generate” all of them and more by assuming one value after 
another in each of its variables. 

While the faith in local sensations and common properties has continued to 
dominate work in perception and concept learning, interest in abstract relations 
has not been entirely dormant. Michotte (1963), for example, has studied the 
abstract relations that provide the basis for our perception of physical causation. 
In his experiments, one perceives two “segregated” patterns (call them objects, if 
you will) which interact according to certain definite temporal and spatial 
relations. Those relations determine the quality of the event. 

Gibson (e.g., 1950, 1966, 1971) has also devoted considerable attention to the 
abstract character of the physical “invariants” which specify depth, motion, the 
separation of surfaces, types of events, etc. For example, it is the relation among 
texture elements, often over time, which specifies many such properties: a 
density gradient of texture elements may specify depth, the displacement of 
some texture elements and the deletion of others may define a moving edge, and 
so on. Gibson asserts that the form of events is specified by “formless invari- 
ants”—the specific nature of the optical or acoustic energy supporting a relation 
is not important. So, for example, he might describe the events in Fig. 1 as two 
examples of a nonreversible “wiping” transformation, defined by a complex 
spatiotemporal relation between the displacement of one set of texture units 
(the rectangle or semicircle) and the disappearance or breakage of texture units 
“underneath” them. Painting and dusting are examples of this class of transfor- 
mations. Clearly the identity of the materials involved (circles, squares, pat- 
terned wallpaper, dust) is not the critical factor. 
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The approach taken by Michotte and Gibson represents an attack on the 
immanent formism which has dominated theories of perception: the world we 
perceive is not built up out of “concrete” bits of substance, but is composed of 
relations in which elements are embedded. There is a striking resemblance 
between their arguments in the domain of perception and the arguments made 
above against concepts of sentence meaning in the domain of linguistic seman- 
tics. The perceived quality of an event, like the meaning of a sentence, cannot be 
defined solely by the properties of its constituent elements. 

In addition to its immanent formism, there was a second problem with the 
traditional view of language: the idea that the information in sentences could be 
defined in isolation from a particular reader or hearer. The analogous problem in 
perception is whether there is an objective means of defining the physical basis 
for our percepts. 

The Gestaltists, with their fondness for illusions and nonveridical perceptions, 
tended to minimize the role of physical information in determining our percepts. 
Gibson has been wary of their approach, because of its subjective quality and 
because it abandons too quickly the effort to find a level of physical description 
at which perception is veridical. Thus, he has recently proposed the development 
of an “ecological physics” which would substitute abstract relations among 
physical variables for the simpler variables of traditional psychophysics. These 
relations would characterize the basis for our perceptions of environmental 
“affordances” such as graspable, impenetrable, “walk-on-able,” slippery, etc. (cf. 
Gibson, Chapter 3, this volume). A critical problem is whether this move to an 
ecological description of physical information can be accomplished without 
reference to any particular organism or class of organisms. What might it mean, 
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for example, to assert that the information is “available” for detection, but that 
other factors determine whether it is actually “picked up” by an organism? Such 
an approach sounds dangerously similar to Katz’s view that the linguist describes 
available semantic information, while the psychologist studies our actual inter- 
pretation of it. 

There are several reasons why the notion of information available “in isola- 
tion” cannot be rigidly adhered to, any more than in semantic theory: 


1. The physical structure of a class of organisms plays an implicit role in 
defining what is walk-on-able, swim-through-able, dangle-from-able. What is 
graspable by a monkey is not necessarily graspable by a rhinoceros. The informa- 
tion available to two eyes with overlapping visual fields (e.g., disparity) is not 
available when these is no overlap. 

2. The affordance of a physical object, considered in isolation, is ambiguous 
to a single species as well. For example, a culvert under a road could afford a cat 
a tunnel to escape through, a dark place to hide, or an opening through which to 
watch something beyond. The momentary motivational state of the cat actual- 
izes a particular affordance; potential affordances can only be defined with 
reference to such potential organismic states. 

3. The information available to an organism (in a pragmatic sense) will depend 
on its maturational level and the degree to which it has learned to attend to 
various relations (cf. E. J. Gibson, 1969). A basketball affords different things to 
a child and an adult, and a painting affords different things to an art historian 
and a novice. 

4. An organism’s activity is critical in determining what it experiences. Moving 
around a motionless object defines it in a way that a single glance cannot. 
Orienting toward an event behind us, bringing it up close to increase resolution, 
backing off from it to foveate all at once, manipulating it in our hands, etc.: all 
of these involve adjusting to the constraints on our perceptual systems in order 
to pick up information which would otherwise be unavailable. 


Not all of these arguments have a direct bearing on the question of defining 
perceptual information in isolation; some of the arguments involve the processes 
by which organisms apprehend structure (actualize a potential affordance). The 
reason for detailing them is not to deny the importance of a physical level of 
support for perception, or the importance of defining that support in terms of 
higher-order relations. Rather it is to argue against the specific claim that there is 
an organism-independent level of description which will be of any use to a 
psychologist. An event’s “affordance” (like a sentence’s “meaning”) is intrinsi- 
cally a psychological relation. The capacities, motivational state, training, and 
activity of the perceiving individual are essential variables in determining the 
nature of that relation. What we experience is the product of our perceptual 
skills and the physical energies in which we are immersed. 
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FIGURAL THOUGHTS AND FIGURATIVE SYMBOLS 


By this point, it is clear that there are some striking resemblances between 
apprehending events through language and by direct perception. Two important 
ones are these: (a) apprehending events is not a function of the specificities of 
elementary forms, but of the abstract relation in which those forms are em- 
bedded; and (b) the apprehended meaning of an event cannot be defined in 
terms of the properties (even abstract ones) of the composition considered in 
isolation, but is a psychological relation between an active, knowing agent and 
the physical compositions it encounters. 

But there is also a striking dissimilarity between these two modes of cognition. 
The physical composition of an utterance has almost no direct relation to the 
physical composition of the events it may be intended to describe. The relation 
is essentially arbitrary, or “symbolic.” On the other hand, abstract relations are 
directly specified when we attend to them with our perceptual systems, that is, 
the relations are directly and “naturally” instantiated in the relationship holding 
between the observer and the physical composition. Direct perception does not 
require an arbitrary translation step to get from the physics of an event to our 
perception of it. (This is the essence of Gibson’s, 1966, 1971, arguments against 
the need to speak of “cues” and “inference” in perception, and “conventions” 
in pictorial perception.) 

However, a rigid dichotomy between arbitrary and direct specification is 
difficult to maintain. One problem is that there are intermediate cases. The 
sequences in Fig. 1 only specify a dynamic event if one understands the 
conventions of comic strips and interprets the layout accordingly. A similar 
spatial translation of time is found in a piano score, where the order of notes in 
successive lines represents an unbroken continuity in time. Similarly, maps, 
diagrams, and blueprints directly specify certain spatial relations, but their 
interpretation also demands some conventional assignments of what is a gravel 
road, an electrical wire, or a water pipe. Some writers argue for a continuum 
between the conventional and the direct, with written language or mathematical 
notation at one extreme and natural events at the other (e.g., Hanson, 1970), 
while others maintain the traditional distinction between “symbols” and “sign.” 
It is not at all clear what difference it would make whether a particular act of 
comprehension involved a mixture of two types of specification, or a mixture of 
different values along a continuum. 

Jenkins, Jiménez-Pabdn, Shaw, and Sefer (1974) have recently elaborated the 
distinction between “symbolic” (arbitrary) and “figural” (direct) modes of 
cognition in developing a model for aphasia. However, their scission of cognition 
into two modes is only partial. They argue that our understandings and inten- 
tions vis-à-vis the world are abstract with respect to either mode of expression, 
and that what the aphasic has lost is the ability to apprehend and express 
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propositions by symbolic means. The aphasic has not lost the ability to perceive 
events or communicate intentions by figural means. The case of the composer 
Rayel provides a striking example (Alajouanine, 1948; Bogen, 1969). Ravel 
became aphasic at the height of his career and lost not only his capacity for 
speech, but his ability to read and write musical notation as well. However, he 
was still able to sing, play a tune by ear, and compose. Thus his “musical 
thinking” was intact—what he had lost was his ability to interpret arbitrary 
specifications of musical relations. 

The analogy between sentences and musical scores introduced earlier in the 
chapter now takes on more substance. First, it suggests that there may be an 
identity between an abstract relation as specified by physical energy (for 
example, a melody as a dynamic pattern of acoustic energy) and as specified by 
symbolic forms (for example, a melody on a musical score). Moreover, the 
figural specification (the sound pattern) seems to be the basis for the abstract 
relation. When we interpret a score we translate from the arbitrary figural 
properties of the printed page into the figural patterns of sound (whether we 
play on a piano or our vocal cords). This is reminiscent of Russell’s claim that 
our “knowledge by description” is based on our “knowledge by acquaintance” 
(cf. Jenkins et al., 1974). Understanding of music as an acoustic pattern must be 
a prerequisite to learning to use musical notation successfully. As Ravel’s case 
indicates, understanding of musical relations can continue to function in the 
absence of notational skills, but it is difficult to conceive of the opposite being 
true. 

A second possibility raised by the sentence-musical score analogy is that one 
can speak of “figural thinking” which operates not in symbolic terms, but in the 
medium of the abstract relations of direct perception and action. (Recall that 
Ravel could not think in notational terms, but could think creatively (compose) 
in the medium of musical relations.) These processes of figural thought can be 
placed under the control of a symbolic specification (as in sight reading music or 
listening to a story), but hardly depend on it for their existence. Finally, these 
figural processes may be the same ones involved in our imaginative elaborations 
on what we see and hear.” 

These possibilities have some fascinating implications for the understanding of 


? Throughout this chapter the term “inference” has been used fairly loosely. The reason 
for this can now be summarized. It may be that many of our novel expectations and insights 
are the product of figural thought. Thus it is not necessary to assume that all novel but 
appropriate cognitions are the product of symbolic propositions. Inferential activities in 
other animals (e.g., chimpanzees, Köhler, 1931; rats, Tolman, 1932) may be viewed as 
examples of a kind of figural thinking which is basic to human cognition as well. (Cf. 
Arnheim, 1969; Bogen, 1969; Gibson, 1971.) Finally, the inclusion of figural thinking may 
make it possible to ascribe much of what appears “inferential” to the direct perception of 
structure at a higher level. 
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metaphoric sentences, and for the “perceptual” bases of sentence comprehen- 
sion in general. 


ANALOGIES AND THE IMAGINATION 


Consider the following analogy: 


(10) An empty prison cell is like a Venus flytrap, waiting for its next victim 
to enter. 


The sentence asserts a resemblance between two things, but what is the resem- 
blance and what are the two things? Clearly the fragments “an empty prison 
cell” and “a Venus flytrap, waiting for its next victim to enter” share little in 
terms of their physical composition. There are no words in common, nor are 
there any words (besides “a” and “an”) which might even be construed as 
synonyms. At the grammatical level we note that each is a noun phrase, but that 
only reflects the very abstract fact that we are comparing two things or events. If 
we consider the two objects, a prison cell and a Venus flytrap, again there is 
little physical resemblance to be found. A clue is provided by the final clause: 
“waiting for its next victim to enter.” This leads us to elaborate an event given 
what we know about the behavior of a Venus flytrap when something “enters” 
it: if a fly happens to land inside, the plant will quickly close to trap the fly, and 
then begin slowly to consume it. 

At this point we will still be frustrated if we search for common elements, 
because the addition of a hapless fly to the flytrap does not make it resemble a 
prison cell (in terms of superficial form) any more than it did before. But the 
entry of the fly does bring a dynamic relation to our attention: the flytrap is a 
cavity which can shut to form an enclosure. Now we have our first resemblance: 
a prison cell is also a cavity which may be sealed by shutting and locking the 
door. The fly’s activity makes a second dynamic relation available: the closure of 
the cavity follows the entry of a smaller object. Again our knowledge of prisons 
allows us to complete the resemblance on the other side: a new prisoner passes 
through the open door of a cell and is locked inside. If we wish to go on we can 
seek even further resemblances. Consider the fly’s fate, which we might ab- 
stractly call an example of progressive destruction of integrity. With little effort 
we can instantiate this dynamic relation in the case of the prisoner: his health, or 
personality, or “spirit” may deteriorate over time. In general, what we are doing 
is creating a more and more elaborate event beyond the tiny fragment “a prison 
cell,” a process guided interactively by our understanding of both prisons and 
Venus flytraps. 

Several aspects of this process of discovering resemblances are worth making 
explicit. 
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First, note that the resemblances have nothing directly to do with the shape of 
the words, their associations and synonyms, the grammatical structure of the 
sentence, or the objects explicitly referred to.” The resemblances are abstract 
relations (mainly dynamic ones) shared by two complex events. That is, the 
resemblances are essentially figural-they could potentially be specified by a 
worldly event as easily as by the events we imagine at the bidding of the 
sentence. For example, anyone who has mastered the conventions for scanning 
comic strips will perceive in each sequence of Fig. 2 most of the dynamic 
relations we discovered in Sentence (10). A movie would project these relations 
even more naturally and directly. Best of all, of course, would be watching a 
flytrap as it snares a victim. 

Without a description of abstract relations based in figural pattern perception, 
one would be hard put to explain the resemblance between Sentence (10) and 
Fig. 2, or between Sentences (10) and (11): 


(11) An enemy fortress is like the open mouth of a killer whale, when a fish 
accidentally swims in. 


Clearly the mode of specification is not critical, nor are the elements which 
manifest the relation, in themselves, important. If we searched for the common 
elements of prison cells, circles, Venus flytraps, fortresses, semicircles, and killer 
whales, our cognitive cupboards would surely be bare. 

Second, note that the analogy in Sentence (10) guided our attention to certain 
characteristics of an empty prison cell by comparing it to an event in which 
those characteristics were fairly salient. Compare our understanding of “an 
empty prison cell” in Sentence (10) with what we understand from another 
analogy involving the same topic: 


(12) An empty prison cell is like seeing a room through Venetian blinds. 


When we seek an interpretation for this sentence, the resemblance we discover 
will be something like “a small enclosure seen (by someone) through a striped 
pattern which alternately allows and obstructs vision.” In this case we have 
embedded the prison cell in quite a different event than the first time. In effect, 
Sentence (12) asks us to imagine someone (preferably ourselves) standing out- 
side a prison cell and looking in through the bars. 

But consider what this imaginative act entails. Not only are we “attending” to 
a very different kind of relation in this case, but we are “orienting” in ways that 
remind us of an organism’s activity in an environment. Sentence (12) asks us to 
perform operations similar to those we perform when perceiving an event 


*Thus, there are obvious limitations on the usefulness of approaches to metaphor based 
on word associates, word clusters, or implicit meaning-unit responses (e.g., Brown, Leiter, & 
Hildum, 1957; Koen, 1965; Osgood, Suci, & Tannenbaum, 1957), or approaches based on 
the contents of images (e.g., Paivio, 1971). 
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directly: it asks us to adopt a station-point just outside a prison cell, to orient 
toward the cell, and to look into it, beyond the bars themselves. Thus, our 
orientation to the event is more critical to the sense of the metaphor than the 
orientation of the bars on an empty prison cell or the slats on Venetian blinds 
(in fact they are vertical in one case and horizontal in another). Recall that these 
activities of moving about the environment, orienting toward things, and looking 
are exactly what Gibson considers integral to normal perception. This implies 
that the basis in perception for the understanding of symbolic analogies may 
involve more than an identity of abstract, figurally-based relations (as so far 
defined). If perception is more than the passive apprehension of structure, then 
we can expect that analogies will also activate the relations underlying our 
actions in the world. Seeing anything in a room by looking through Venetian 
blinds requires that we be as close to the window imaginatively as we would have 
to be in real life. If this is true, we will need a broader conception of 
“figurally-based” relations which incorporates both our perceptual activity and 
the dynamics of events. 

A third aspect of our comprehension of a sentence like (10) is that we fill in 
events which are only crudely or incompletely specified in the sentence. That 
the “victim” was a fly in one event and a prisoner in the other had to be 
supplied from our knowledge about the things that were specified. We also had 
to supply information about flytraps’ behavior, prison cell doors, the fate of flies 
captured by flytraps, etc. Once a relation was perceived in one domain, it was 
like a mathematical equation which could be used to generate events in the 
second domain by supplying appropriate particulars for its variables. Where this 
was possible, we accepted the relations as part of the sense of the analogy. But 
if, for example, we sought an instantiation of horizontal stripes in the context of 
a prison cell, Sentence (12), we would probably fail and thus be forced to seek a 
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sense for that analogy which was abstract with respect to orientation. Finally, 
note that the generation of elements in the second domain relied on the 
abstractness of our concepts. The process would be very awkward to explain by 
a common elements approach. 

The processes of intermingling two domains are clearly related to the use of 
models and analogies in scientific theories. Hesse (1966), for example, describes 
“theoretical explanation as [a] metaphoric redescription of the domain of the 
explanandum” (that which is to be explained). In part, this amounts to using 
familiar relations in the domain of the vehicle to isolate and describe properties 
of the domain under investigation. As we have seen, the power of this approach 
is that once some part of the structure of the vehicle is seen to be shared by the 
events under study, we may fill out our understanding of the topic domain by 
imagining entities and generating predictions consistent with that structure. The 
relations supplied by the vehicle are indispensible in this process. Thus scientific 
theories are a form of “knowledge by description”: creating and understanding 
scientific assertions depends on our “knowledge by acquaintance” and is there- 
fore rooted in our understanding of figurally based relations. In fact, it is likely 
that most of what we learn through language is a result of analogic references to 
the relations made available by our earlier experience. 


SOME EXPERIMENTS ON ANALOGIC SENTENCES 


A recent series of studies by the author (Verbrugge, 1974) demonstrated many 
of these phenomena with sets of analogies in simile form. In some of the 
experiments, subjects heard either (a) a list of analogies which predicated a set of 
topics with one set of vehicles, e.g., Sentence (10), (b) a list of analogies 
predicating the same topics with different vehicles, e.g., Sentence (12), or (c) a 
list of the topics alone, e.g., “an empty prison cell.” As in the Verbrugge and 
McCarrell (1973) study, subjects’ recall for the sentences was prompted with a 
set of cues. In these experiments, however, the prompts were a second set of 
analogies (each specifying the same relations as a target sentence), rather than 
verbal descriptions of the shared relations. Both symbolic and figural analogies 
were used as prompts to explore whether subjects inferred relations which were 
abstract with respect to the mode of specification. For example, recall of 
Sentence (10) was prompted with either Sentence (11) (a symbolic analogy) or 
Fig. 2 (a figural analogy); both of these prompt analogies were designed to 
specify essentially the same relation as that specified by the acquisition sentence. 

When subjects were given a set of the symbolic prompts in a paced prompted 
recall task, they were able to recall an average of 69% of the target sentences 
when the prompts specified relevant relations. This was close to the maximum 
degree of correct pairing to be expected under any conditions (82%, estimated 
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by an unpaced matching task) and was significantly greater than the number of 
sentences subjects were able to produce in free recall (47%). The effectiveness of 
the symbolic analogies as prompts is remarkable, since there were no lexical, 
referential, or syntactic similarities to guide recall. The only available basis was 
the similarity of an abstract relation among the objects referred to in the 
sentences. In neither the acquisition sentences nor the prompt sentences was this 
relation made explicit; it was only available if subjects elaborated the topical 
domain (from a perspective suggested by the vehicle) and discovered the appro- 
priate resemblance. This distinctive inferred relation was apparently more central 
to subjects’ experience (and memory) of a sentence than the presence of 
distinctive “immanent forms”: symbolic analogies which specified relevant rela- 
tions were significantly better as prompts than distinctive content words chosen 
at random from each of the target sentences. 

When subjects’ recall was prompted with figural analogies, the effectiveness of 
the prompts varied depending on the subjects’ task during acquisition. When 
subjects wrote verbal descriptions of the shared relations or simply listened to 
the sentences, prompted recall was relatively poor (38% and 30%, respectively). 
However, when subjects drew sketches of the relations (the kinds of patterns and 
events described in the sentences), they later recalled an average of 62% of the 
sentences. This is close to the number of sentences subjects were able to pair 
with the figural analogies in an unpaced matching task (69%). As with symbolic 
analogy prompts, the effectiveness of the figural analogies can only be ascribed 
to their specification of relevant relations among the objects depicted. Recall 
could not be ascribed to thinking directly of the original objects themselves, 
since subjects who viewed the figural analogies without having heard the acquisi- 
tion sentences very seldom thought of the objects mentioned in the related 
sentences. For example, only 9% of subjects viewing Fig. 2 thought of Venus 
flytraps and 2% thought of prison cells. 

The power of a vehicle in guiding the discovery or activation of relations was 
also demonstrated with both types of prompts. When subjects had heard irrele- 
vant predications on a topic or just the topic alone, the prompts produced only 
about 10% recall even though they specified relations which were true of the 
topic.* For example, Sentence (11) and Fig. 2 were poor prompts for “a prison 
cell” when subjects’ attention had been drawn to the (abstract) affordances of 
barred apertures by Sentence (12). 


4In contrast to the Verbrugge and McCarrell (1973) study, prompted recall of the topics 
alone was not intermediate between recall of relevant and irrelevant sentences. The reason 
for this is probably that the grounds of the earlier study tended to be more salient as 
properties of the topics (considered in isolation) than the grounds in the analogies of the 
present study. Thus subjects in the earlier study would be more likely to think of the 
relevant relations during their acquisition experiences and (on average) be better prepared to 
recognize the relations in the related prompts. 
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Overall, figural specifications of relations were found to be as effective as 
symbolic specifications in prompting recall of abstractly similar events. The most 
parsimonious explanation is that the same relation can be specified by both 
modes, and that the prompting of recall need not be mediated exclusively by 
either mode. In particular, it is highly unlikely that the recall with either mode 
of prompts was mediated by verbal associates or shared linguistic features. In 
fact, articulating the ground of an analogy in a sketch (a figural activity) was 
superior to articulating it in words (a verbal or symbolic activity). (This superior- 
ity was found for free recall as well as for figural prompted recall.) If anything, 
the processes which mediate recall seem to have a more natural affinity with the 
figural mode. It is difficult to avoid the conclusion that the comprehension of 
analogic sentences involves imagining events in a medium of abstract, figurally- 
based relations. 

Support for this conclusion was found in a set of experiments on “mixed- 
mode analogies,” that is, analogies with verbal topics and figural predicates. For 
example, subjects saw the topic “an empty prison cell is like . . .” displayed on a 
projection screen, followed by a short animated cartoon of a schematic event 
like one of those in Fig. 2. The expectation guiding these experiments was the 
following: if it is true that the verbal predicates of analogic sentences guide 
attention to certain figurally-based resemblances, then figural predicates should 
be at least as effective in guiding attention to the relevant relations. This 
expectation was confirmed; in fact the relevant symbolic and figural analogy 
prompts were more effective in producing recall of the topics than they had 
been when subjects listened to analogic sentences (74% with symbolic prompts, 
89% with figural prompts). The figural predicates probably focus subjects’ 
attention more sharply on the relations specified by the prompts, while the 
symbolic predicates tend to be less constraining. However, the difference be- 
tween the two modes of predication was minor compared to the difference 
between relevant and irrelevant predications. Irrelevant prompts produced recall 
of only 14% of mixed-mode analogies’ topics. Thus figural predicates can affect 
a comprehender’s access to certain relations as powerfully as symbolic predi- 
cates. The perception of resemblances seems to be independent of the particular 
mode of instantiation as well as of the particular objects supporting the shared 
relation. 


METAPHOR: THE QUICK AND THE DEAD 


How does the vehicle exert its influence on our understanding of a topic? In the 
research just reported this influence was unmistakable, but the design did not 
allow a test of how subjects inferred or abstracted the shared relations. The 
process might be thought of in terms of one of two perceptual metaphors: either 
as a back-and-forth scanning of two juxtaposed events, or as a kind of fusion of 
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the two domains into a single event. Though this suggestion is certainly speculative, 
the distinction has considerable intuitive appeal and has many analogues in 
literary criticism and the philosophy of science.° 

The first approach is that assumed in the earlier analysis of sentence (10): the 
events are elaborated separately and a relation found in one is sought in the 
other. It is a process similar to looking back and forth at two pictures or cartoon 
sequences set side by side (e.g., Figs. 1 and 2), scanning for resemblances. This is 
the kind of thinking encouraged by similes, especially those which are essentially 
equalities between two events requiring no further elaboration; for example, 


(13) A snowflake is like an asterisk. 


Since juxtaposition leaves the topic uncontaminated by the vehicle, such anal- 
ogies are best fitted to the Aristotelian view that metaphor merely decorates 
meaning. 

The second approach is more difficult to describe, but is probably more 
important. It involves a fusion of both events, and thus a transformation or 
warping of each domain according to the particular constraints of the other. In 
the case of Sentence (10), for example, we may imagine the prison cell trans- 
formed into the shape of a Venus flytrap, but on its side, since a prison door 
opens that way. The cell may have an animate quality to it: it is “waiting”; you 
can almost hear its mouth watering. (This suggests there is a second vehicle 
beyond the first: a hostile animal, like a wolf—in fact the Venus flytrap is a 
fascinating plant because it is so “animate.”) A fly enters the cell in the guise of 
a prisoner, the door folds shut, and the prisoner-fly begins to dissolve on the 
floor. 

This interpretation is exaggerated and bizarre, but it is not an uncommon kind 
of thing to do, especially in response to metaphors which assert an identity 
between two things. For example, the sentence 


(14) Skyscrapers are the giraffes of a city. 


asserts that skyscrapers are giraffes; in fact, the response of many people to it is 
to visualize a huge giraffe in the middle of a city skyline, its neck extending far 
above the “other” buildings. What this process seems to entail is that each 
domain is operated on in parallel, and the more they are found to share the more 
they will fuse. Metaphors in which little is shared (especially mixed metaphors) 
are likely to lead to awkward and motley fusions. 

It is not immediately obvious which of these processes is intrinsically more 
powerful in guiding our perception of resemblances. Fusion seems to involve a 
more imaginative transformation than simple juxtaposition, yet it can also 


For example, Coleridge made a distinction between Fancy, “a power of assembly or 
collocation, involving simply the mechanical noting of resemblances,” and Imagination, “a 
unifying ‘esemplastic’ power” (Hawkes, 1972, p. 47). 
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produce bizarre and distracting hybrids. The juxtaposition view seems most 
compatible with those philosophers of science who view an analogy as a kind of 
temporary instrument, discovered in an unaccountable flash of insight, exploited 
for its use in generating and testing hypotheses, and then discarded when the 
theory has been formalized in its own right. The acceptance of a theory, and the 
discovery of facts which confirm or disconfirm it, are public processes in which 
private metaphors and fantasies play no part. If Rutherford prefers to visualize 
atoms as tiny red billiard balls, it may be of anecdotal interest, but it is irrelevant 
to the symbolic equations which represent the nature of particle interactions. 

An alternative view, closely related to the fusion process, is that the vehicle is 
integrally related to the origin, understanding, use, and acceptance of a scientific 
theory. People who have defended this view have labelled the domain of the 
vehicle in various ways: a “root metaphor” (Pepper, 1942), a “paradigm” 
(Kuhn, 1970), a “metaphoric redescription” (Hesse, 1966), a “world view,” and 
so on. The approach is inevitably “perspectivistic.” Facts are an illusion fostered 
by the positivists and immanent formists. Our understanding of an event is 
guided by our prevailing metaphor. Like the prison cell, the domain we wish to 
understand is transformed according to the properties of the metaphoric vehicle. 
Thus the perspectivist claims that a symbolic formalization of a theory never 
exhausts its meaning—symbols only take on meaning and explanatory power 
when interpreted (by a someone). In fact, a long debate has raged in the 
philosophy of science over whether different theories with common formaliza- 
tions are equivalent and whether the “surplus meaning” associated with formal 
entities in theories can be excluded when the theories are evaluated (e.g., 
Carnap, 1956; Hanson, 1966; MacCorquodale & Meehl, 1948). 

In part, this debate involves the question of whether a word or theory 
maintains its power of specification when its metaphoric basis is no longer 
active. Metaphors may be ignored, may die, may fade away. For example, we 
now speak of “chair legs” without activating the original metaphoric domain 
(people’s legs), but the metaphor was very much alive to the Victorians who 
insisted that, in all modesty, chair legs ought to be covered by “skirts” which 
extended to the floor, When a metaphor is alive, it is rich with potential for 
novel inferences. When it has died, the old vehicle is left with only a narrow 
power of specification (as in the case of chairs’ legs), or it becomes altogether 
meaningless in the present context (as in the case of “phlogiston” or “animal 
spirits”). As Pepper (1942) writes: 


Concepts which have lost contact with their root metaphors are empty abstractions. 
When a world theory grows old and stiff..., men begin to take its categories and 
subcategories for granted and presently forget where in fact these come from, and 
assume that these have some intrinsic and ultimate cosmic value in themselves [p. 113]. 


Thus the fate of metaphor in a general language community must be the same in 
a scientific community as well. Due to excessive formalization or to a paradigm 
shift, the metaphoric domain of a theory may no longer be activated by its users. 
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The relations previously discovered may continue to generate research, but 
without the transforming potential of the vehicle, they ossify and cannot in 
themselves be productive of new understanding. A formalization, in itself, is 
lifeless—like the words in a sentence or the notes in a piano score. It cannot 
encapsulate meaning. A theory requires interpretation, and thus is open to 
changes in interpretation. 

All too often the formalist attitude leads us to believe that theories may be 
made completely explicit, or that the meanings of terms may be publicly 
codified into dictionaries. In the growth of a science, of a community, or of a 
person, metaphoric understanding is a dynamic process, which makes continual 
access to prior, figurally based experience. It is a fallacy to think that a symbolic 
description can capture the essence of this understanding. And if understanding 
is a process of change, then it is no more helpful to consider meaning as a 
closed system of standard senses than it is to look at a single snapshot of a 
dynamic event: 


A dictionary is a frozen pantomime. Our problem is only beginning when we consider 
the pale flowers of the ‘nosegay of faded metaphors’ that it presses between its pages. A 
semantic theory must account for the process of metaphorical invention—all the more 
so, a theory that stems from generative grammar with its emphasis on creativity . . . It is 
characteristic of natural language that no word is ever limited to its enumerable senses, 
but carries within it the qualification of ‘something like’ (Bolinger, 1965, p. 567]. 


To date, psychology has devoted little attention to analogical processes, perhaps 
out of a feeling that they are “deviant” or “subjective.” This is more than 
unfortunate: there may be no more important human skill than perceiving the 
“something like” through language. 
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How to Catch a Zebra 
in Semantic Memory 


Elizabeth F. Loftus 
University of Washington 


When Class D baseball was in the midst of its waning days, the team from Ful- 
ton, Tennessee rather suddenly changed its nickname from Railroaders to 
Lookouts. It wasn’t that anybody was particularly unhappy with the name 
Railroaders. What happened is that the uniforms of the Fulton team were 
rapidly fraying at the seams, and the slightly more affluent “Lookouts” from 
Chattanooga graciously gave away some of their used uniforms. Since “Look- 
outs” was lettered on the hand-me-downs, the owners of the Fulton team either 
had to change the lettering or change the team name. It was apparently easier to 
change the name of the team, The fact that the “Lookouts” were named after 
the famous mountain overlooking Chattanooga, while the city of Fulton looked 
out upon an absolutely flat area, was something that the owners of the Fulton 
team saw fit to completely ignore. 

Clearly names are arbitrary things. If I and a few friends decide to call “milk” 
by some other name, we can do so, and we do not change the quality of the 
object to which we refer. In fact, we might actually move to France and call it 
“du lait” or move to Spain and call it “leche”; either way we are still referring to 
that “opaque white fluid secreted by female mammals for nourishment of their 
young (Fowler & Fowler, 1964). Attributes, on the other hand, are not 
arbitrary. Milk is white, no matter what we call it. If we try to change its color 
by, say, adding chocolate syrup, we now have “chocolate milk.” We have 
changed the quality of the object. Both the names and the attributes of objects 
that we know are stored in semantic memory. In this chapter, I will argue that 
the retrieval of these two types of information involve different strategies. 
Before presenting this argument, and a resultant model for semantic memory 
retrieval, a brief discussion of some “history” is appropriate. 
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SEMANTIC MEMORY: SOME RECENT RESEARCH 


Although the area of semantic memory has been the focus of a great deal of 
recent research activity, it is still far too early to provide exact details about the 
mechanism by which we retrieve semantic information. One earlier effort was 
aimed at determining the extent to which the retrieval mechanism involves 
successive as opposed to simultaneous consultation of the memory store. The 
successive-simultaneous distinction was thought to be a basic distinction with 
many important consequences: one of them concerns the effect of size of the 
array to be searched on searching time. To the extent that the retrieval process 
involves successive consultation of the memory store, retrieval time should be a 
function of the number of items in the array to be searched. Conversely, to the 
extent that simultaneous consultation (or “parallel processing”) is involved, 
retrieval time should be relatively unaffected by the number of items in the 
array. 

In an attempt to shed light on this issue, a number of experiments were perform- 
ed in which array or category size was a major independent variable. Many experi- 
ments shows that it takes longer to decide that a word is a member of a large 
category than to decide it is a member of a small category (Atkinson & Juola, 1971; 
Landaver & Freedman, 1968; Juola, Fischler, Wood, & Atkinson, 1971; Meyer, 
1970; Tzeng, 1972; Wilkins, 1971) but some did not (Rips, Shoben, & Smith, 
1973; Smith, Haviland, Buckley, & Sack, 1972). Even when an effect of category 
size was observed, a strictly sequential processing model was not usually pro- 
posed. Landauer and Freedman (1968) concluded that the reaction time (RT) 
differences observed for scanning large versus small categories was too small 
to be interpreted as evidence for the existence of a successive scanning process. 
Landauer and Meyer (1972) proposed a “category-search” model, but made no 
specific assumptions about the order or manner of search, that is, whether it was 
serial or not. Tzeng’s (1972) retrieval mechanism incorporates both parallel and 
Successive processes—successive draws and parallel scanning within the items of 
each draw. Collins and Quillian (1969) proposed a complex network model. 

Freedman and Loftus (1971) departed from the previous work that had 
studied identification time and instead measured the speed with which a subject 
could actually produce a word himself. Instead of giving the subject a stimulus 
and asking him to decide whether it was a member of a category, he was given a 
category and asked to produce a word that belonged in it. For example, a 
subject who was asked to name a fruit beginning with the letter P might say 
peach, pear, or plum, among other possibilities. Because category size had no 
effect on RT, a sequential scanning process seemed unlikely, and instead a 
hierarchical network model, fashioned after that of Collins & Quillian (1969), 
was proposed. Since that time, other network models of human memory have 
appeared (Anderson, 1972; Crothers, 1970; Rumelhart, Lindsay, & Norman, 
1972) indicating the popularity of the approach. In network models, human 
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memory is conceived as a giant network of interconnected nodes. The nodes 
usually correspond to individual concepts, ideas, or events in the system. The 
links or connections between the nodes specify particular relations that hold 
between the concepts. To illustrate, the system might have the concepts DOG 
and ANIMAL which would be represented as distinct nodes with a connection 
between them specifying that a DOG is a member of or a subset of ANIMAL. 

In the next three sections, I will discuss (1) the original model proposed by 
Freedman & Loftus (1971), (2) some evidence for the model and some problems 
with it, and (3) the modifications of the model which resulted in a new 
model—the Dictionary—Network Model. 


THE FREEDMAN-LOFTUS MODEL 


No serious investigator of human memory believes that retrieval involves a 
sequential scan through the entire contents of memory. As Tomkins (1970, p. 
63) has noted, such a mechanism would result in a 50-year old man requiring 
somewhere on the order of 400 years to answer the question “What is your 
name?” Similarily the process of finding a particular category member does not 
appear to involve a strictly sequential scan. If the retrieval process does not 
involve successive scanning, of what does it consist? The original model proposed 
by Freedman and Loftus (1971) assumed a hierarchical organization, meaning a 
system that is divided into a number of interconnected subsystems, each of the 
latter being hierarchical itself. Specifically, the basic conception was that of a 
memory store organized into a complex hierarchy composed of categories (e.g., 
animals) with subsets of each (e.g., birds, dogs) and supersets (e.g., living things). 
Within each category, a variety of subsets exist. Some are noun categories that 
are specialized members of the larger set (e.g., birds and dogs are subsets of 
animals). Some are clusters of words that are highly associated for any of a 
number of reasons. They may have qualities in common (e.g., all begin with the 
letter “S” or all have long names or all rhyme). They may be associated with 
each other in the individual’s experience (e.g., all be animals in the Pooh books). 
These clusters are probably more idiosyncratic than the main noun categories, 
but are somewhat consistent across individuals. And finally, under each category 
is an undifferentiated enumeration of exemplars. Within all subsets, clusters, and 
enumerations, the instances will be listed in a more or less constant order, 
according to their frequency in the language or dominance in the category. 

How does a person retrieve a category member; for example, how does he find 
a “yellow fruit”? Freedman and Loftus assumed that the process involved two 
major steps: (1) entering the appropriate category—“fruits” and (2) finding an 
appropriate member of that category—“lemon” (plus, of course, the time re- 
quired to produce that response verbally). Thus, the subject first enters the 
category “fruits,” then he enters a cluster of yellow fruits, and reads out the first 
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entry. If instead he was looking for a fruit beginning with P, the category 
“fruits” would be entered directly, the subset “fruits beginning with P” would 
be found, and the first entry read out. In both cases, retrieval consists of finding 
a main set and a subset, rather than serially searching through an array of 
instances. The Freedman and Loftus (1971) reaction time experiment is not 
only consistent with the general model, including the assumption that the 
memory store is organized primarily into noun categories, but it also provides an 
estimate of the duration of the first step of the retireval process. 

Every stimulus pair (e.g., fruit-yellow, fruit-P) was presented with the noun 
category either first or second with a short interval between the noun and 
restrictor. When the noun comes second, the total retrieval process begins only 
after its presentation. When the noun comes first, the first step of the process 
(category entry) can be begun before the second half of the pair is presented. 
Since reaction times are measured from the presentation of the second half of the 
pair, any difference due to order can be assumed to be caused by that part of the 
process that is completed before the second half is shown. Thus, the difference 
between noun first and noun second is an indication of the existence of step one 
and of its duration. The evidence is as follows: when nouns are presented first, 
the mean reaction time is 1.87 sec; with nouns second, it is 2.12 sec. This 
difference of .25 is quite stable across somewhat different conditions, supporting 
the idea that step 1 consists of entering the noun category, and suggesting that 
the duration of Step 1 is approximately .25 sec. 

To summarize, Freedman and Loftus presented a hierarchical storage model, 
and proposed that retrieval from semantic memory consists of two distinct steps 
(entering the appropriate category, and then finding an appropriate member of 
that category). The general model is consistent with the lack of relationship 
between category size and RT, and is given some support by the difference in 
RTs when the noun is presented first and second. That difference of .25 sec also 
serves as an indication of the duration of Step 1 of the retrieval process. 


THE FREEDMAN LOFTUS MODEL: 
THE CASE FOR AND AGAINST 


Support for the Model 


Evidence on various aspects of the model can be found in several recent 
publications. For example, Loftus (1972) provides evidence that the semantic 
memory store is organized primarily into noun categories. In her experiment, 
200 subjects were presented with 20 nouns and 20 adjectives. If the stimulus 
word was a noun category, subjects were told to write any item that belonged in 
that category (e.g., seafood—oyster, clam, fish). For adjectives, the subjects were 
told to write any item which had the inherent quality of that adjective (hard— 
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brick, rock, wood, etc.). The mean number of responses given to noun stimuli 
was greater than the mean number of responses given to adjective stimuli. More 
detailed analyses of the response protocols indicated that when the stimulus was 
a noun almost all of the subjects’ responses were instances of that noun 
category. There was some tendency to group together category members which 
have qualities in common. For example, Sweden, Norway, and Finland (all 
Scandinavian countries) often occurred together in the response protocols for 
“country.” It appears that within many categories, a variety of subsets often 
exist—subsets whose members are highly associated perhaps as a result of the 
fact that they have these qualities in common. In contrast, when the stimulus 
was an adjective, subjects tended to name a number of items from one noun 
category, shift to another category and name a number of items from that 
noun category, shift to another category and name a number of items from that 
category, shift again, and so on. An actual sample protocol for the stimulus 
“small” best illustrates this point: “Fly, bee, ant, eye, ear, toe, finger, grape, 
cherry, strawberry.” The subject named three insects, four body parts, and three 
fruits. Apparently the reason the subjects name fewer items in the case of 
adjectives is that they used time shifting from one category to another. In the 
case of nouns, any “shifting” which may occur is confined within the main noun 
category itself. 

There exist additional empirical and theoretical arguments for noun priority. 
For instance, Marshall and Newcombe (1966) observed a dyslexic patient who 
had great difficulty reading verbs, adjectives, and prepositions, but could read 
nouns quite well. Miller’s (1969) data is also relevant. He asked judges to sort 
words on the basis of similarity of meaning. Results indicated that judges sorted 
first on the basis of part of speech (nouns with nouns, verbs with verbs, etc.), 
then within each syntactic category they proceeded to sort on some other basis. 
Nouns were sorted on the basis of their class terms (for example, knight and 
mother are both persons). The real point of interest if that people put verbs 
together, not on the basis of their meanings, but on the basis of nouns that could 
be used with them. Adjectives, too were frequently classified on the basis of the 
nouns they could modify. At a theoretical level, the work of Chomsky (1965) 
also suggests some priority for nouns. When substituting words into phrase 
markers, nouns are selected independently and verbs are selected in terms of 
those nouns. Chomsky shows that serious complications in grammar would arise 
were one to first insert verbs by means of context-free rules, and then select 
nouns in terms of the verbs. 

All of this evidence suggests strongly that nouns have some sort of priority, 
that they are somewhat more accessible than other types of lexical items. The 
assumption that the semantic memory store is organized primarily into noun 
categories therefore seems reasonable. Having made the assumption, other types 
of memory phenomena fall into place. Category clustering is a case in point. Ifa 
person is asked to recall a list of instances of various categories such as “dog, 
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turnip, horse, carrot...” in any order he wishes, his recall will be clustered by 
categories. The order “dog, horse, turnip, carrot ...” would quite probably be 
observed (Bousfield, 1953). If a subject first entered an area of memory which 
contained a particular category, then searched the category for recently observed 
items, then switched to another category, etc., clustering is exactly what would 
be expected. 

Without going into great detail, it is worth noting that other aspects of the 
model have been empirically supported. Loftus, Freedman, and Loftus (1970) 
have provided additional evidence that each category in the hierarchy can be 
entered directly, whereas Loftus and Freedman (1972) further support the 
notion that the retrieval process consists of category entry as the first major 
step. 


Problems with the Model 


Despite the supporting evidence, there are several difficulties with the model. 
First, the notion that so many clusters exist—that every category contains 
separate clusters for each letter— seems implausible. It is true that when asked to 
name fruits beginning with the letter P, the subjects have little difficulty in 
rapidly producing many correct responses. That is, they generally do not say 
“peach,” pause and then say “pear,” pause and then say “plum.” Further, the 
typical pattern we have found in informal testing is a spurt of several responses, 
perhaps followed by a pause and then several more responses, etc. While this 
might suggest that these words are clustered in some sort of functional relation- 
ship, there are other examples where this “clustering” or rapid responding does 
not occur. Subjects who are asked, for example, to name articles of furniture 
that begin with the letter C do typically say “chair,” pause and then say “chest,” 
pause and then say “couch.” 

A second reason for modifying the model was alluded to in the opening 
paragraphs of this chapter regarding the distinction between names and at- 
tributes. The model, as it stands, assumes that the process of retrieving an 
“edible bird” is functionally equivalent to the process of retrieving a “bird 
beginning with the letter C.” To find an edible bird, according to the model, the 
category “birds” is entered, the subset “edible birds” is entered, and the first 
entry is read out. Analogously, to find a bird beginning with the letter C, 
“birds” is again entered, the subset “birds beginning with C” is entered and the 
first entry is read out. In both cases retrieval consists of finding a main set and a 
subset. Nowhere in the model is a distinction made between a stimulus which 
restricts the type of object (the bird must be edible) and a stimulus which 
restricts the name of the object (the bird’s name must begin with the letter C). 
“Edible” is an attribute of some birds “beginning with the letter C” is a 
characteristic of the names of some birds. Searching for an attribute may not 
involve the same process as searching for a name. For years philosophers have 


14. HOW TO CATCH A ZEBRA IN A SEMANTIC MEMORY 399 


emphasized the distinction between attributes and names (see for example, 
Quine, 1960), and more recently pilot data collected in our laboratory suggest 
that different strategies are involved in the search for an attribute versus the 
search for a name. 

The pilot data in question resulted from an experiment in which subjects were 
shown a noun category paired with either an adjective or a letter, and asked to 
produce a word that fit the restrictions imposed by the two. In one condition 
adjective and letter trials were randomly intermixed (the “mixed” condition); 
this condition serves as a replication of Freedman & Loftus (1971). There were 
also two “blocked” conditions. In one, subjects saw only noun categories paired 
with adjectives (i.e., only adjective trials), while in the other blocked condition 
subjects saw only noun categories paired with letters (that is, only letter trials). 

We found that the time to respond on an adjective trial was independent of 
whether the subject knew he would be seeing an adjective, that is, the RT on 
adjective trials was equal in mixed and blocked conditions. However, a subject’s 
responses on letter trials were dramatically affected by his expectations; he 
responded significantly faster in the blocked than in the mixed condition. 

Without making additional assumptions, the original model has difficulty 
handling these results. In addition, the results argue strongly for the point we 
have been making: There is a difference between an adjective and a letter trial, 
between a question which restricts the type of object and one which restricts the 
name of the object. Two types of strategies appear to be possible: The strategy 
for adjective trials is not influenced by advance warning that an adjective is 
coming, whereas the strategy for letter trials is so influenced. Clearly, a model of 
retrieval should reflect these differences. 


THE DICTIONARY-NETWORK MODEL 


We shall consider here a new model which has a somewhat changed conception 
of the memory structure and an elaborated conception of the retrieval mecha- 
nism. The new model incorporates the fact that when a word is stored in 
semantic memory, at least three kinds of information about that word are 
included: semantic, phonemic, and orthographic information. This is a desirable 
feature since we know that subjects use these three kinds of information in 
attempting to search their semantic memories to find a particular word (Brown 
& McNeill, 1966). 

The major structural change includes the addition of a “dictionary,” which is 
simply a listing of the lexical items of the language (see Fig. 1.) As to the 
question of what constitutes a lexical item, we realize that it could be a word, a 
morpheme, an idiom, or some other unit, but we do not attempt to decide that 
issue here. As to the question of whether the listing is ordered (it could be 
ordered alphabetically, or according to frequency of usage, or age of acquisition 
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FIG. 1. The dictionary network model. 


or in some other way) we do not attempt to decide that issue here. This 
dictionary is of a very special kind: it includes both phonemic and orthographic 
information about each of its entries,’ and in addition, each entry serves as an 
address telling where semantic information can be found. Thus, the system is 
content addressible; each entry provides direct access to the portion of semantic 
memory that represents the meaning of that entry. Our assumptions about the 
organization of semantic information are similar to our earlier formulations. We 
retain the notion of a hierarchical organization composed of categories (e.g., 
animals) with subsets of each (e.g., birds, dogs) and supersets (e.g., living things). 
We retain the idea that within each category there are clusters of items that have 
qualities in common (e.g., large animals, yellow birds). We no longer assume the 
existence of clusters of items that have phonemic attributes in common (e.g., all 
animals beginning with the letter B). Information about the sounds and spelling 
of items is contained in the phonetic-orthographic dictionary. 

To illustrate, the word animal might receive an entry in the dictionary as 
diagrammed schematically in Fig. 1. Since the dictionary itself contains 
phonemic and orthographic information about the word animal, we can use this 
information to answer questions like “what letter does the word animal begin 
with?” or “How is the last syllable of the word animal pronounced?” To answer 
a question about the meaning of a word, it is necessary to consult that part of 
the semantic store which contains this information. Figure 1 presents a fragment 


'To accompany a dictionary, one additional component is needed, namely a set of rules 
for governing its use. These rules would allow us to know what is likely to be listed in the 
dictionary and what is not. For the moment we will not be concerned with these rules, since 
in our experiments all inputs are “allowable.” 
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of a hierarchy that might be encountered as part of someone’s semantic memory 
structure. We can use the information in the semantic network to answer 
questions like “Is a bird an animal?” or “What is the name of an edible bird?” 

Using this as a general model, we can begin to describe how a subject manages 
to find an answer to the kinds of questions he is asked in our experiments. How 
does he find (1) a category member to which a particular adjective is applicable 
(e.g., edible birds); and (2) a category member beginning with a particular letter 
(e.g., an animal beginning with “D”). 

Consider case (1) in which the subject must name an edible bird. We now 
assume that the retreival process has at least three major steps. In keeping with 
our earlier view, the first step consists of entering the appropriate category (e.g., 
“birds.”) The next step consists of finding an appropriate cluster of items and 
picking a member of it. Once this member is found, the third step involves 
looking up the “name” of the item by reentering the dictionary via a pathway 
from that member to its name. Thus, the subject finds that the edible bird he has 
picked is called a “chicken.” At first glance, the notion that one could find a 
member without knowing the “name” of that member may seem awkward. But, 
on second glance, there is no reason why a particular type of category member, 
or any other concept for that matter, must necessarily have available a single 
“name” for it. Clearly there can be concepts without names. For example, one 
can comprehend the concept of a “navigational instrument used in measuring 
angular distances, especially the altitude of sun, moon, and stars at sea,” without 
the name “sextant” being accessible (Brown & McNeill, 1966). One can surely 
comprehend the meaning of “damp, soft snow” without being a Laplander and 
having a single name for it. The Dictionary-Network Model is a reasonable 
system that allows access of conceptual meaning via the lexical entry in the 
dictionary as well as permitting use of the meaning as a direct guide to the 
lexical entry.” 

Consider Case 2 in which a subject must name an animal beginning with the 
letter D. Again, the subject begins by entering the “animal” category, which is 
accomplished by looking up “animal” in the dictionary, and going directly to 
the portion of memory that contains information of the meaning of “animal.” 
The next step is somewhat different for questions of this type. Here we assume 
that the subject systematically proceeds along pathways leading from animal 
instances to the dictionary representations of the names of those instances. This 


2This view is in agreement with Broadbent (1971). He argues that words do more than 
simply stand for the objects they represent. They function to “select certain stored 
information [p. 469].” If a word simply stood for the object it represented, a paradox 
would result; a sentence such as “Richard Nixon is President of the United States” is 
informative, yet the two names in it both refer to the same entity. Broadbent argues that 
the sentence should be viewed as a transfer of information in memory between two 
previously labeled regions of memory. 
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quasiparallel simultaneous search proceeds towards the dictionary from all 
possible members (though a less dominant member may start off more slowly or 
slightly later). The dictionary is used to determine whether a given animal 
instance is one that begins with the letter D, and the search terminates when one 
is found. 


THE CASE FOR THE DICTIONARY-NETWORK MODEL 


Now that we have outlined a general model for the structure of semantic 
memory and have outlined a specific process model, we tie the general scheme to 
some experimental data. Since the attribute versus name distinction is central to 
the dictionary-network model, the experiments described are concerned with 
further exploration of the difference between searching for an attribute and 
searching for a name. 


The ‘Adjective versus Letter” Experiment 


Grober and Loftus (1974) completed an experiment that was similar in many 
respects to the pilot study discussed earlier. Subjects were shown nouns paired 
with adjectives or letters which were presented in blocked and mixed conditions. 
In the mixed condition, adjective trials (MA) and letter trials (ML) were 
randomly mixed together. In the blocked condition, subjects saw only adjective 
trials (BA) or only letter trials (BL). Across subjects, specific noun-etter and 
noun-adjective stimuli occurred equally often in the mixed and blocked condi- 
tions. Thus, for example, half of the subjects saw “Fruit-P” in a mixed 
condition and half saw it in a blocked condition. Every subject received all four 
conditions and the order of the conditions was counterbalanced across subjects. 
An important addition to this experiment involved the inclusion of an interval 
between the presentation of the noun and restrictor: on half the trials the 
interval was short (.5 sec) while on the remaing trials it was long (2.5 sec). 
Reaction time was measured from the onset of the restrictor. 

Our predictions in terms of the Dictionary-Network Model are as follows: on 
adjective trials, the first step of the process is to enter the category. The next 
step involves finding the appropriate cluster to enter, and cannot begin until the 
subject knows what cluster he is to enter. The subject must wait until the 
adjective is presented before he can begin this next step. Knowing that an 
adjective is coming, or having a long interval of time, should be of no help; 
therefore, there should be no effect of blocking or of interval duration. 

A different situation is assumed for letter trials. Here, the first step of the 
process is entering the category. The next step is a quasi-parallel simultaneous 
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search towards the Dictionary. That is, the subject traces some number of 
pathways leading from category instances to the Dictionary representations of 
those instances. This step can be started during the interval between the 
presentation of the category name and the restricting letter if the subject knows 
a letter is coming; BL trials should therefore be faster than ML trials. Further- 
more, if more searches can be completed (i.e., paths can be traced) in a longer 
interval of time, then having a long interval should help on BL trials. 

First consider the results shown in Fig. 2, which illustrates reaction time in 
mixed and blocked conditions for adjective and letter trials. The most important 
finding was the significant interaction between the type of trial and whether it 
occurred in a mixed or a blocked condition. As predicted, a subject’s response 
time is fastest if he knows a letter is forthcoming. 

To assess the effects of interval duration, we looked at mixed and blocked 
trials separately. For mixed trials, interval duration had no effect. This replicated 
the lack of effect reported by Freedman and Loftus (1971). For blocked trials, 
however, interval duration had a dramatic effect: RTs for letter trials presented 
with a long interval were considerably faster than those in any of the other 
conditions. 

An alternative interpretation of these data is possible. Perhaps on blocks of 
trials involving letter cues, the subject simply sets his mouth to say a word 
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FIG. 2. Reaction time for noun-adjective and noun-letter stimuli in mixed and blocked 
conditions. 
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beginning with the cued letter before (or even simultaneously with) the search 
through memory for an appropriate category member. Having one’s mouth set 
might speed RT by reducing the response execution component. For adjective 
trials, a subject could not use such a response set strategy, and for mixed letter 
trials he might not. To settle this issue, ten subjects participated in an experi- 
ment identical to the “adjective versus letter” experiment except for the fact 
that no interval occurred between presentation of noun and restrictor. If the 
alternative interpresentation is correct, the advantage of the BL condition should 
remain since a subject should still be able to set his mouth to say a word 
beginning with the letter while he is searching for a member. The results 
indicated, however, that without an interval the advantage of the BL condition 
disappeared. 


The “Adjective plus Letter” Experiment 


In this experiment subjects were presented with stimuli consisting of anoun cate- 
gory plus an adjective and a letter (e.g., animal-small-M) and had to produce a 
member of the category that satisfied the two restrictions. That is, the response had 
had to be a member of the category that began with the given letter and to which 
the adjective was applicable (e.g., “mouse” for the above example.). The subjects 
saw the category first, but the order in which the adjective and letter restrictors 
were presented was varied: on half the trials the adjective came .5 sec. before the 
letter (e.g., animal-small-M), while on the remaining trials the letter came first 
(e.g., animal-M-small). Reaction time was taken from the onset of the last 
restrictor. 

The model predicts that a subject should be faster when the adjective is 
presented first. The reasoning is as follows: In either case, the first step of the 
process is to enter the noun category. When the adjective appears first, the 
subject can then enter an appropriate cluster of items, At this point he syste- 
matically proceeds along pathways leading from instances that are within that 
cluster to the dictionary representations of the names of those instances. The 
dictionary is used to determine whether any of the instances begin with the 
appropriate letter, and the search terminates when one is found. As before, this 
quasiparallel simultaneous search can be started during the interval before the 
restricting letter appears, since the subject knows a letter is coming. 

When the letter appears first, however, the picture is somewhat complicated. 
The subject may do one of two things. First, after entering the category, he may 
begin to trace pathways towards the dictionary attempting to find a member 
that begins with the given letter. Once he finds the member, he must go back to 
the semantic network to see if the adjective applies to it. This strategy requires 
an extra “trip” between the dictionary and the network. Alternatively, to save 
the trouble of this extra trip, the subject might simply wait until the adjective 
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arrives, essentially converting a “letter—adjective” trial into an “adjective-Ietter” 
trial. Either way the prediction is a slower RT when the letter appears first.” 

We might mention that our prediction for the “adjective plus letter” experiment 
is not obvious: other quite reasonable models would not make this prediction. 
For example, one type of set-theoretic model assumes that once a category is 
entered, its members are sequentially searched in a relatively fixed order and 
that the search is terminated when an appropriate instance is located (e.g., 
Heider, 1973). To the extent that retrieval involves such a successive consultation 
of the memory store, the speed of retrieval should not be a function of the order 
of presentation of adjective and letter. If each instance is successively examined 
to see if it meets both specifications, and the search terminates when such an 
instance is found, there is no reason to expect that presentation order will affect 
retrieval time. 

A slightly different serial model assumes that subjects enter a subcategory 
defined by the category name and the first restrictor, and begin searching in a 
fixed order for an instance which satisfies the second restrictions. This model 
would predict that RT is faster when the letter is presented first. The reasoning 
is as follows: when the letter is presented first, a very small subcategory is 
usually defined, and finding an instance that fits the adjective restrictor should 
be a very easy matter. For example, if given animal-M-small, searching the 
category “animals beginning with M” should result in rapid production of a 
small animal beginning with M since roughly half of the members of that 
subcategory could conceivably (depending on the subject’s criterion) be con- 
sidered small. When the adjective is presented first, the subcategory defined by 
the noun and the adjective is searched to find an instance that fits the letter 
restrictor. Since the standard of acceptability is much higher now (the instance 
must begin with a particular letter), the expected length of search for an instance 
meeting this standard is longer. In other words, a serial search for an instance 
meeting the letter restriction would involve consultation of many more items. 

The results of the above-mentioned study are shown in Fig. 3, and are in 
accord with the predictions from the dictionary-network model: when the 
adjective comes first, the average reaction time is 1.00 sec, whereas when the 
adjective comes second, it is 1.45 sec. 


3Qne could argue that the reason that “adjective—letter” trials would be faster than 
“letter—adjective” trials is because in the former case you have to read only one letter, while 
in the latter case you have to read an entire word. (Recall that RT was taken from the onset 
of the second restrictor). In answer to this argument, I report some data on the reading 
response to printed letters and words. In an experiment by Cattell (reported in Woodworth 
& Schlosberg, 1954, p. 57), the subjects were shown letters and words and responded by 
reading them aloud. Cattell obtained the following average reaction times from his two 
trained subjects: reading single letters, 409 msec; short words, 388 msec; long words, 431 
msec. 
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FIG. 3. Reaction time as a function of order of presentation of adjective and letter 
restrictors. 


Differential Organization in Output: A Demonstration 


One observable consequence of the Dictionary-Network Model concerns the 
number and organizations of words that a subject produces in a given amount of 
time under various conditions. If we present a subject with an adjective (such as 
“small”) and ask him to write down as many items as he can that have the 
inherent quality of the adjective, his output is organized with regard to semantic 
properties (Loftus 1972). When given a letter stimulus, however, and asked to 
name words that begin with that letter, subjects should do something very 
different: they should be able to produce more items, and these items should 
not necessarily be organized with regard to semantic properties. The reasons for 
these expectations are as follows. When given a letter, the subject enters the 
dictionary at the proper phonetic region, generates an item, tests to see if the 
item is a word, and then outputs the word. The semantic network need not enter 
into the process. When given an adjective, however, the subject must, in addi- 
tion, make use of the semantic network to test whether the particular item at 
hand does, in fact, have the quality of the adjective. Thus, when presented with 
an adjective, fewer items should be produced when the amount of time given is 
limited. 

The experiment that tested this prediction was quite straightforward. Subjects 
were given a notebook with a stimulus written on the top of each page. They 
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were told that if the stimulus was an adjective, they should write down as many 
items as they can that have the quality of the adjective. If the stimulus was a 
letter, they were to write down words beginning with that letter. They were 
given one minute to respond to each stimulus. 

The average number of words produced given an adjective was 9, whereas given 
a letter, subjects produced over 14 items. The output given an adjective stimulus 
was more often than not organized according to semantic properties (as de- 
scribed by Loftus, 1972); the output after a letter stimulus was not so organized. 
To illustrate, when asked to produce words beginning with the letter G, one 
subject began by writing “glue, glib, glide, gloat, gleam. . . .” This protocol is 
typical of what is observed: organization is clearly orthographic or phonemic, 
but not, in general, semantic. 

As a point of interest, we also observed the “S versus Z” effect: in the course 
of one minute, subjects produced about 16 words beginning with S, whereas 
they produced only 6 words beginning with Z. To explain this effect is relatively 
easy: suppose that the process of naming words beginning with any letter 
involves (1) entering the dictionary, (2) generating an item, and (3) testing to see 
if the item is a word. Words are then output while nonwords are rejected. The 
reason more S words than Z words are produced hinges on the fact that there 
simply are more words beginning with S than words beginning with Z. (Count 
the S and Z pages in a regular dictionary or the number of Ss and Zs in a 
Scrabble game for quick proof.) Since less phonetic variety exists for Z words, it 
is likely that, while generating items beginning with Z, more nonwords are 
generated. These must be rejected, and each rejection, of course, takes time. 


THE FUTURE 


The theoretical ideas the experimental studies being conducted in the area of 
semantic memory can be of tremendous benefit to education. For example, we 
will have a much better idea of what and how to teach to the extent that we 
can characterize the memory structures people build and the retrieval mecha- 
nisms people use. 

A second benefit to education is that the techniques being used to determine 
the structure and retrieval of well learned information can be applied to 
information that is in the process of being learned. One attempt to do this 
(Loftus & Loftus, 1974) began in a situation in which natural learning of 
categories was taking place. The situation was a graduate school in psychology 
where students were learning, among other things, the names of psychologists. 
They learn that any given psychologist may be associated with one or more 
subfields of psychology (perception, memory, social, etc.), which is tantamount 
to learning to categorize psychologists with respect to these areas. A question of 
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interest is whether different structures and retrieval strategies might be evident 
in people who are in different stages of graduate school. Will the retrieval 
strategies of the advanced student show any similarity to the retrieval of well 
learned information such as animals, fruits, etc.? 

To answer this question, an experiment was designed which asked graduate 
students to produce the names of psychologists. On any given trial, the psy- 
chologist named had to satisfy two restrictions: both (1) an area of psychology 
and (2) a letter of the alphabet were shown (e.g., ““Learning—B”) and the subject 
had to produce the name of a psychologist that began with the given letter and 
who was associated in the subject’s mind with the given area. For example, a 
subject who was presented with the stimulus “Learning-B” might say “Bower,” 
“Bourne,” or “Blodgett,” among other possibilities. On half the trials, a subject 
saw the area first (e.g., “Learning-B”) while on the remaining trials he saw the 
letter first (e.g., “B-Learning”). Reaction time was taken from the onset of the 
last restrictor. As may be clear without further mention, this “area plus letter” 
experiment is extremely similar to the “adjective plus letter” experiment in 
which subjects had to name, for example, an animal that was small beginning 
with M. In the latter experiment, RT was much faster when the adjective 
preceded the letter. If memory for psychologists is as efficient, one might expect 
the same advantage in RT to obtain when the area precedes the letter. 

The results, shown in Fig. 4, indicate that advanced students were faster when 
the area was presented first, while the beginning students favored the condition 
in which the letter occurred first. This finding makes a great deal of sense when 
you consider what a student of psychology knows. The advanced student is 
aware that the “category” of psychologists is subdivided into areas such as 
learning and perception, just as his category of animals is subdivided. When given 
the area before the letter, this student can restrict himself to the area-defined 
subclass, and then search for a psychologist whose name begins with the 
particular letter requested. The beginning student, however, does not have psy- 
chology so well organized; the field is not yet neatly subdivided. This student 
knows a few important names; he probably knows Freud, Skinner, Piaget, and 
possibly some others. When the letter is given before the area, the beginning 
student probably begins scanning his “list” for a name beginning with the 
particular letter requested, and then produces that name almost irrespective of 
which area is presented. If the student stretches it, Piaget and Skinner could 
both “fit” into quite a few different areas of psychology. 

Regardless of the exact storage or the exact retrieval mechanisms that these 
two types of students are using, two things are clear: the retrieval patterns 
observed in the present study are related to how much instruction a person had 
completed, and (2) the more experience a student has had with the field of 
psychology, the more his retrieval of this information seems to mirror the 
retrieval of material that we know is well organized and learned. It is tempting to 
speculate that reaction time measures can be used to assess—in much more subtle 


14. HOW TO CATCH A ZEBRA IN A SEMANTIC MEMORY 409 


— ADVANCED 


eeeeece BEGINNING 


3 5.00 
Ww 
= 400 
| 
z 
O 300 
= 
3 
w 200 


AREA -LETTER LETTER - AREA 


ORDER OF PRESENTATION 


FIG. 4. Reaction time as a function of order of presentation of area and letter for be- 
ginning and advanced students. 


ways than we now do-a student’s progress in learning new material. Ultimately 
these measures will tell us a great deal about the process of acquiring new 
material. 


CONCLUSION 


In this Chapter I have presented some theoretical ideas and related empirical 
facts concerning semantic memory. A major theme has been the distinction 
between the name of an object and the attributes of that object. The theoretical 
ideas are embodied in a Dictionary-Network Model; the facts accounted for deal 
with a specific type of retrieval. While the Model is not the only possible ex- 
planation for the data, it has considerable value as a framework for (1) discussing 
particular semantic memory experiments and (2) viewing other kinds of memory 
phenomena—category clustering, word-classification findings, the TOT phenom- 
enon, and cognitive acquisition, to name a few. 
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Remember That Old Theory 
of Memory? Well, Forget It!’ 


James J. Jenkins 


University of Minnesota 


From the early 1950s till the early 1960s, Wallace Russell and I directed a 
project aimed at examining the role of language in behavior. While that project 
involved most of the staff of the psychology department at Minnesota at one 
time or another, and concerned itself with everything from clinical protocols to 
nonsense syllables, the long-term, persisting core of the project turned out to be 
the studies that Russell and I performed with word associations. Here we 
examined the effects of established associations on learning, recall, recognition, 
perception, and problem solving. Everywhere we looked we seemed to find 
powerful relationships. Then we proceeded to establish associations de novo and 
repeated these studies with materials constructed entirely in the laboratory so 
we could control the histories of the associations that we were dealing with. (An 
account of these studies is given in Jenkins, 1963, and Jenkins, 1970.) 

Those were very exciting days for us. When the research program began, there 
was virtually no experimental evidence for mediation theory, although such 
theory was widely applied in the analysis of higher-mental processes (e.g. 
Osgood, 1953). By the time the program was over, mediational phenomena were 
well established, due in part to our efforts. When the program began, experi- 
mentalists (with a few exceptions like Cofer & Foley, 1942) regarded association 
norms as something belonging in the clinical domain. By the time the program 
was over, experimentalists were using norms for all kinds of materials as 
important ingredients in their investigations. But, on the other hand, when the 
research program began, we were hopeful that we could proceed directly from 
our experimental work to a general account of language itself. By the time the 
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program was over, we were convinced that that particular journey was not 
possible. 

So, in spite of the fact that the theory we espoused was gaining in popularity 
and in spite of the fact that we had produced a very large number of successful 
experiments (Russell and I had done about 120 studies and David S. Palermo 
and I had done about 40 or 50), we abandoned that line of research. The trouble 
was that we became convinced that the “higher mental processes” were not the 
kinds of things that we had originally believed them to be. In spite of the fact 
that we were moving rapidly, we were not going where we wanted to go. 

As we struggled with this problem, I became convinced that I was caught in a 
metatheoretical trap, a trap that I had built for myself without ever realizing it. 
My friends and I were carrying about with us common sets of unexamined 
beliefs and attitudes that both directed and limited our research and theory in 
many ways. I will leave it to others to examine their own presuppositions; the 
following are the ones that I now see as dominant in my thinking in the 1950s. 


1. Units. I believed that words were the fundamental units of language. To me 
this was natural and obvious. 

2. Relations. I believed that there was one kind of relation between words, 
associative linkage. Words became linked to each other through use together or 
use in the same “frames.” 

3. Structures. I believed that mental structures (if there really were any such) 
were assemblies of links, essentially chains of the fundamental units in their 
fundamental relation. Hierarchies, like Hull’s “habit family hierarchies,” were 
simply lists of chains varying in strength, so that one was employed first, then 
another. 

4. Complex behaviors. I believed that complex behaviors were built of simple 
subassemblies; that things got more complicated but not different “in kind.” 
(This belief justified concern with simple units and relations and sanctioned 
experiments on such units in the faith that they would eventually add up to the 
complex behaviors of language.) 

5. Mechanistic explanation, I believed that explanation ultimately rested on a 
description of the machinery that produced the behavior. I believed that a de- 
scription of the machinery plus the history of the organism and its present 
circumstances inevitably predicted its behavior. There are two correlaries of this 
belief. First, I thought that the action of the machinery must necessarily be 
automatic. And, second, I could see that most of the interesting behaviors had to 
be explained by extensive reliance on learning and memory. 


While these particular expressions of belief are probably unique to me, it is 
easy to see that they embrace the usual metatheoretical presuppositions that we 
identify with associationism. The associationist believes in some kind or kinds of 
basic units. (The number is not important.) He believes in some kind or kinds of 
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relations between units. He holds that the more complex behaviors are the same 
“in kind” as simple behaviors. He believes that explanation consists of an 
explication of mechanism. He believes that behavior is automatic and for any 
kind of complex behavior, he relies on memory. 

This view is so pervasive in American psychology that it is almost coextensive 
with being an experimentalist. Indeed, I think many of us confuse the dicta of 
associationism with the grounds of empirical science itself. But associationism is 
only one view; it is not a necessary view. 

In this chapter, I would like to sketch for you an alternative position and 
present some experiments that convince me that there is exciting and productive 
work to do when we approach our field from another perspective. I will argue 
that this position alters the way we interpret phenomena, emphasizes the 
importance of natural problems and realistic methodology and changes what 
counts as an explanation. 

The domain I have chosen is memory. Associationism is most dominant and 
clearly revealed in this area and, as I have pointed out, a general associationist 
account of almost anything leans heavily on memory. 

In place of the traditional analysis, I suggest a contextualist approach. This 
means not only that the analysis of memory must deal with contextual variables 
but also that what memory is depends on context. This view clashes with the 
associative view but I hope that I can convince you that the contextual approach 
is closer to the truth in the simple sense that it is in better correspondence with 
experimental data than the usual associative theory. 


A CONTEXTUALIST FORMULATION 


The term contextualism is not highly familiar to American psychologists, but it 
is an American philosophical position that has been intimately intertwined with 
American psychology for three-quarters of a century.? Another name for it is 
pragmatism and it has its roots in William James, C. S. Peirce, and John Dewey. 
Unfortunately, to most of us the term pragmatism refers to only one aspect of 
the total philosophy, namely the use of practical criteria for judging “truth.” 
That aspect is not the one I want to emphasize here, nor is it the aspect that 
bears most directly on conceptions of memory. Hence, I have chosen the less 
familiar, but more descriptive name, contextualism. 

Contextualism holds that experience consists of events. Events have a quality 
as a whole, By quality is meant the total meaning of the event. The quality of 
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the event is the resultant of the interaction of the experiencer and the world, 
that is, the interaction of the organism and the physical relations that provide 
support for the experience. The relations can be thought of and analyzed into 
textures. A texture in turn consists of strands lying in a context. 

Consider as an example of an event, a listener hearing a sentence. The total 
meaning of the sentence in that context is the quality of the event. We might 
note that the event persists in time over some period but that all of it at once 
forms a single, unified, psychological moment. This is the “specious present” of 
William James—the span of dimensionalized time is psychologically unified in 
experiencing the quality of the event. The textures that support the quality of 
the event are those of the hearer and those of the world. If our concern is with 
the qualities of the utterance, we can turn our attention to its texture, which 
consists of the words and grammatical relations between them in that context. 
Within that texture we can easily make out strands that make up the phrases. We 
can also see that some of the strands extend across the context of the phrases, 
that a word in one phrase depends on or determines a word elsewhere. We can 
become aware of the choices and presuppositions that the analyzed strands refer 
to, Some words and phrases can be shown to acquire their exact meaning in that 
event, from their relations to more far-flung aspects of the context; a gesture, a 
reference, a topic previously spoken of, an event known to both the speaker and 
the hearer to be in the future, presumed common knowledge and so on. 

In this fashion, the analysis of the utterance will proceed far beyond the 
immediate physical context of the sentence and far beyond the physical stimuli 
immediately present. For the contextualist, no analysis is “the complete analy- 
sis.” All analyses eventually “sheer away” from the event into more extensive 
contexts. This argues that there is no one analysis, no final set of units, no one 
set of relations, no claim to reducibility, in short, no single and unified account 
of anything. What makes an analysis good or bad for us is its appropriateness for 
our research and science and its utility in our pursuit of understanding and 
application. 

This is a crucial difference between associationism and contextualism. Associa- 
tionism asserts that there is some one correct and final analysis of any psycho- 
logical event in terms of a set of basic units and their basic relations. When you 
have reduced an event to these terms, you are through; the job is done once and 
or all. The contextualist takes the much less comfortable position that a 

complete” or “final” analysis is a myth; that analyses mean something only in 
terms of their utilities for some purposes. This means that being a psychologist is 
going to be much more difficult than we used to think it to be. And it means 
that the domain of experimentation and experimental methodology will have to 
be criticized more intently than we thought necessary before. 

Before I can persuade you to join in these discomforts, I am sure that I must 
offer you something better than exhortations. Let us see if this view contributes 
to the generation and understanding of experiments. 
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EXPERIMENTAL EXAMPLES 


I want to put before you three types of experiments that we have been 
performing in our laboratories at Minnesota in the last few years. I think each 
type illustrates some important aspect of the contextualist’s position. The free 
recall studies that we will consider first, showed us the importance of “compre- 
hension” in determining the quality of an event. The experiments on the 
recognition of events, that we will consider next, showed us the unimportance of 
contiguity in time and taught us some novel things about the activity involved in 
the recognition of related and unrelated items, Finally, the studies of the 
integration of information, that we will consider last, brought both of these 
together and showed us that the subject’s orientation toward comprehension 
changed the nature of what was remembered in such a way as to produce radical 
changes in recall and recognition. 


Free Recall 


The free recall experiments started innocently enough in pursuit of a simple 
question: If we could get subjects to attend to the meaning of words, would 
they show more associative clustering in recall than subjects who attended to 
other aspects of the words? 

Associative clustering is an old phenomenon (Jenkins & Russell, 1952). If one 
presents stimuli and responses from a word association test in random order and 
asks subjects to recall the words, the subjects cluster the associates together in 
recall. For example, if the list of words contains table, king, and salt early in the 
list, and words like chair, queen, and pepper later in the list, the subjects will 
recall the words out of order; that is, they tend to recall table and chair together, 
king and queen together and salt and pepper together. In the late 1950s we 
assumed that this effect was an automatic consequence of associative strength, 
but in the 1960s we thought we had better take another look at it. Tom Hyde 
and I (Hyde & Jenkins, 1969) decided to see whether what the subject was doing 
when he heard the words had an effect on the recall phenomena. 

For a task that would require comprehension of the stimulus words, we asked 
the subjects to rate whether the words were pleasant or unpleasant. For non- 
comprehension tasks we decided to focus the subjects on the form of the words; 
we asked them to indicate whether the words they heard would be spelled with 
an “e,” or we asked them to estimate the number of letters in each word. We 
called these formal tasks. All subjects heard the same word list in the same order 
via tape recorder. 

The results were very interesting. When subjects performed the task that 
required that they comprehend the words, their recall was excellent and they 
showed a high degree of associative clustering. They were as proficient in recall 
and clustering as a control group of intentional learners who performed no 


418 JAMES J. JENKINS 


orienting task at all! On the other hand, the subjects who performed the formal 
tasks recalled only half as many words and showed little associative clustering. 
(As the literature suggests, whether the learning was incidental or intentional had 
no appreciable effect on recall for any of the groups doing orienting tasks.) It 
was impressively clear that the most important determiner of recall was the 
nature of the event the subjects experienced, when their task brought them into 
contact with the stimulus words. It suggests that subjects recall the quality of 
the events they have experienced, not stimuli to which they have been exposed. 
And in spite of the fact that we often think of comprehension as automatic for 
skilled readers and listeners (such as our subjects) a radical difference appears in 
recall between subjects who were required to comprehend the words and those 
who were not. 

As contextualists we must say that the occurrence of the word on the audio 
tape is no quarantee of any particular experience for any auditor. The quality of 
his experience is determined by the interaction that he has with the physical 
texture presented. That texture can provide support for many different kinds of 
experience. In this simple case, the quality of the experience is governed largely 
by the task instructions that we have given the subject insofar as they affect his 
comprehension of the words. 

But the associationist is not easily upset by this experiment. He draws our 
attention to explanations that do not have recourse to terms like comprehension 
and quality of events. And he objects that the phenomenon is limited. Let us 
look at some of these arguments and objections briefly. 

“Maybe,” say the critics, “the results are due to the number of units each 
group has to consider. The comprehension groups deal with words; the formal 
groups deal with letters. The formal groups are penalized because they have to 
think about a lot of units and the recall is in terms of the kinds of units the 
comprehension group is using.” Fair enough. Let us change tasks and see if this 
possible explanation holds up. 

Carroll Johnston and I (Johnston & Jenkins, 1971) selected rhyming as our 
formal orienting task, and we selected appropriate modifiers (giving adjectives 
for nouns and nouns for adjectives) for our comprehension orienting task. Thus, 
all subjects listened to words and wrote other words in response. The results 
duplicated the first experiment. The subjects doing the formal task (making 
thymes) showed poor recall and little clustering. The subjects performing the 
task that required comprehension showed recall and clustering essentially like 
that of the intentional learning group which had no orienting task to perform. 
Exit the first counter hypothesis. 

It was suggested that the effect relied entirely on the organization made 
possible by the associates in the list and that the tasks did not really affect recall 
of individual words. This notion was quickly refuted by running comparable 
experiments with both related and unrelated words (Hyde & Jenkins, 1973). 
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Tasks requiring comprehension resulted in twice as much recall as formal tasks 
that did not require comprehension, even including formal tasks like assigning 
part of speech to the words. 

The third and fourth alternative explanations usually arose together—although 
they are contradictory. We call them the “time hypothesis” and the “effort 
hypothesis.” The arguments are as follows: 


1. The time hypothesis: The tasks requiring comprehension must be easy. The 
subjects do them rapidly and have the remaining time for rehearsal. The formal 
tasks take more time and, thus, these subjects have little opportunity to 
rehearse. The differences you find are due to differential opportunities for 
rehearsal. 

2. The effort hypothesis: The comprehension tasks are difficult. These sub- 
jects must pay more attention and devote more effort to the task in “pro- 
cessing” words. Thus, they remember more readily. The formal tasks do not 
require much attention or effort and thus are not remembered. 


David Walsh and I (Walsh & Jenkins, 1973) tackled these “mirror twin” 
hypotheses in an unorthodox kind of study that Walsh invented. Take a list of 
words, Determine their recall level with a good comprehension task. Deter- 
mine the recall level with a formal task. Now, using the same time intervals 
for presentation, ask new subjects to do two tasks in the interval for each 
word. If the time hypothesis is correct, recall on any combination of tasks 
should be poorer than recall on either task alone. If the effort hypothesis is 
correct, recall on any combination of tasks should be better than recall on 
either task alone. However, if the quality of the experience is the important 
factor (as we supposed) then comprehension tasks should produce good re- 
call whether they are performed alone or in combination with some other 
task. On the other hand, formal tasks should result in poor recall alone and 
in combination with each other. But when they are combined with a compre- 
hension task, they should not detract from it. 

This reasoning presents a nice experiment. If the time hypothesis is correct, all 
combinations of tasks will yield poor recall. If the effort hypothesis is correct, 
all combinations will result in improved recall. If the contextualist position is 
correct, those combinations containing a comprehension task will result in good 
recall and those combinations that involve only formal tasks will result in poor 
recall. 

In an extensive series of experiments with many different kinds of formal 
tasks, the outcome was clearly in favor of the qualitative contextualist position. 
Both the time and the effort hypotheses were defeated. Comprehension tasks, 
alone or in combination with anything else, resulted in recall as good as that of 
an intentional learning group which performed no orienting task. Combinations 
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of two formal tasks resulted in poor recall. (See Hyde, 1973, for an independent 
replication.) 

The last candidate for an alternative explanation was related to the feeling that 
something about the experiment was artifactual. Perhaps, our colleagues sug- 
gested, the general “set” of the subjects was affected in some way by the 
comprehension instructions—that it was not the particular task or the fact that it 
required comprehension but, rather, that the subjects developed a different set 
which somehow affected the recall. Now, surely overall set has an important 
influence on the outcome of an experiment, but our particular analysis denied 
that the effects we were getting depended on such an overall orientation. 

To investigate this problem, Robert Till and I (Till & Jenkins, 1973), following 
a suggestion from Endel Tulving, performed a series of experiments with a 
within-subject, within-list design. In this experiment each subject performed 
each kind of task. The subject heard each word followed by a letter, that is, 
Table—A. The word was the object of his attention as usual. The letter was a cue 
that told him which task to perform on that particular word. Some tasks 
required comprehension of the word and some did not. The technique permits 
the comparison of the recall of words that have been involved in comprehension 
tasks versus formal tasks for each subject during a single list presentation. 

The results are reassuring with respect to the earlier analysis. Subjects recalled 
twice as many words from the subsets on which they performed comprehension 
tasks as they did from the subsets on which they performed formal tasks. 

These experiments, and many others that we have performed, convince me 
that the best way to think of this experiment is in the contextualist’s frame- 
work, taking the event to be the interaction of the subject with the individual 
word. The quality of this event is radically different in tasks requiring compre- 
hension, from the quality of the event when comprehension is not required. 
These differences in quality are reflected in the radical differences observed in 
recall. 

How could you further enhance recall of the word list? For the contextualist a 
reasonable move would be to develop a situation in which context was even 
more novel and the comprehension had to be developed in some special manner 
related to that context. Buford Wilson and John Bransford (personal com- 
munication), reasoning in just this fashion, created such an experiment. They 
asked one group of subjects to rate each word as “pleasant or unpleasant to you 
if you were on a desert island.” As they predicted, the recall of words by this 
group was even better than the recall by the usual “pleasant-unpleasant” rating 
group. 

I must also point out before going on, that the analysis given here applies 
equally well to other tasks that we know will facilitate the free recall of single 
words: tasks such as asking subjects to form images, to rate words for vividness 
of imagery, to sort words into semantic categories, etc. If the task requires 
comprehension, the quality of the event is such that it is readily recalled. 
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EVENT RECOGNITION 


Let us proceed to another type of experiment that I will call “event recognition” 
for want of a better name. What I want to illustrate here is that the phenomena 
we find in recognition depend on the quality of event that the subject constructs 
from the experimental material during the acquisition phase of the experiment. 
(In the free recall studies we kept the materials constant and showed the effects 
of varied tasks; here we will keep the tasks constant and show the effect of 
varying the materials.) 

For our experimental example let us take the exciting and original work of 
John Bransford and Jeffery Franks on the abstraction of linguistic ideas (Brans- 
ford & Franks, 1971). For our purposes the seminal study they invented as 
graduate students can serve as the key illustration. The material in the study was 
a set of sentences. The subjects were asked to listen to the sentences, hold them 
in memory for a few seconds and answer an elliptical question about each one. 
Although there were no instructions about the nature of the sentences that were 
used in the experiment, it is apparent to even a casual listener (or reader) that 
the sentences fall into four interrelated groups. Each group of sentences consists 
of the “elements” of some overall event, taken one, two or three at a time in 
various sentence compositions. 

A typical set of sentences that a subject might hear in such a study is given in 
Table 1. The reader who has never participated in such an experiment is invited 
to read each acquisition sentence, one at a time, count to five, answer the 
question, then go on to the next sentence, repeat the routine, etc. The reader 
should then take the recognition test. The best way to understand the experi- 
ment is to experience it. 

After the acquisition set was read to the subjects, the recognition set of 
sentences was presented and the subjects were asked to judge whether they had 
heard each sentence before and to give a confidence rating to the judgment. 

The results of the study are just the opposite of what one would expect from 
an “associativelink” model of memory. In general, the subjects are certain 
(with ratings between plus four and plus five on a scale of five) that they have 
heard the full, complex sentences that describe the four events. In fact, of 
course, they have not heard those sentences, nor have they even heard any 
sentences that long. As the sentences get shorter and shorter (down to “The cat 
jumped on the table” and “The girl lives next door”) the subjects are less certain 
about whether they heard them or not, although they generally believe that they 
have. 

Whether the sentence actually occurred or not in the acquisition set has very 
little influence on its rating as long as it is one of the possible sentences that 
describes the event. (In the recognition set given in the Table, no “old” sentence 
is presented, for example, yet subjects will estimate that from 5 to 80% of the 
sentences were presented earlier.) If, however, the sentences are mixed (for 
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TABLE 1 
Sample Experiment by Bransford and Franks (1971) 


Acquisition Sentences: Read each sentence, count to five, answer the question, go on to the next 


sentence. 


The girl broke the window on the porch. 

The tree in the front yard shaded the man who was smoking his pipe. 
The hill was steep. 

The cat, running from the barking dog, jumped on the table. 
The tree was tall. 

The old car climbed the hill. 

The cat running from the dog jumped on the table. 

The girl who lives next door broke the window on the porch. 
The car pulled the trailer. 

The scared cat was running from the barking dog. 

The girl lives next door. 

The tree shaded the man who was smoking his pipe. 

The scared cat jumped on the table. 

The girl who lives next door broke the large window. 

The man was smoking his pipe. 

The old car climbed the steep hill. 

The large window was on the porch. 

The tall tree was in the front yard. 

The car pulling the trailer climbed the steep hill. 

The cat jumped on the table. 

The tall tree in the front yard shaded the man. 

The car pulling the trailer climbed the hill. 

The dog was barking. 

The window was large. 


STOP. Cover the preceding sentences. 


Broke what? 
Where? 
What was? 
From what? 
Was what? 
What did? 
Where? 
Lives where? 
Did what? 
What was? 
Who does? 
What did? 
What did? 
Broke what? 
Who was? 
The what? 
Where? 
What was? 
Did what? 
Where? 

Did what? 
Which car? 
Was what? 
What was? 


Now read each sentence below and decide if it is a sentence from the list given above. 


Test Set: How many are new? 


The car climbed the hill. 

The girl who lives next door broke the window. 

The old man who was smoking his pipe climbed the steep hill. 
The tree was in the front yard. 

The scared cat, running from the barking dog, jumped on the table. 
The window was on the porch. 

The barking dog jumped on the old car in the front yard. 

The tree in the front yard shaded the man. 

The cat was running from the dog. 

The old car pulled the trailer. 

The tall tree in the front yard shaded the old car. 

The tall tree shaded the man who was smoking his pipe. 


(old___, new ___) 
(old __, new —) 
(old __, new —) 


(old __, new —) 
(old __, new —_) 
(old —_, new —) 


(old —, new —) 
(old —, new —) 
(old —, new —) 
(old __, new —) 
(old —, new —) 


(old __, new —) 


(continued) 
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TABLE 1—Continued 


The scared cat was running from the dog. (old __, new —) 
The old car, pulling the trailer, climbed the hill. (old —, new —) 
The girl who lives next door broke the large window on the porch. (old —, new —) 
The tall tree shaded the man. (old —, new —) 
The cat was running from the barking dog. (old __, new —) 
The car was old. (old —, new —) 
The girl broke the large window. (old —, new —) 
The scared cat ran from the barking dog that jumped on the table. (old —, new —) 
The scared cat, running from the dog, jumped on the table. (old __, new —) 
The old car pulling the trailer climbed the steep hill. (old __, new —) 
The girl broke the large window on the porch. (old —, new —) 
The scared cat which broke the window on the porch climbed the tree. (old __, new —) 
The tree shaded the man. (old —, new —) 
The car climbed the steep hill. (old __, new —) 
The girl broke the window. (old —, new —) 
The man who lives next door broke the large window on the porch. (old __, new —) 
The tall tree in the front yard shaded the man who was smoking his pipe. (old —, new —) 
The cat was scared. (old —, new —) 


STOP. Count the number of sentences judged “old.” 


example, “The scared cat running from the barking dog broke the window”), or 
if a new arbitrary fact is introduced (“The tall girl who lives next door broke the 
window”), subjects reject the sentence with certainty. 

What has happened here is clear to the contextualist. The subjects have used 
the various strands repeatedly available in the texture of the experiment to 
construct four events which are completely described by the four long, complex 
sentences. The quality of each of the events is indeed the total meaning of the 
complex sentence. Once the fusion of the strands into events has occurred 
(particularly since the strands are heard over and over again in various combina- 
tions), the subject cannot perform an analysis to recover the exact pattern of 
input that furnished support for the construction that he made. 

The analysis here is very much like the contextualist’s account of the develop- 
ment of science—pulling together separate strands and seeing the unity of events 
even though they are not presented contiguously in time or space. The notion of 
fusion is an old one in the contextualist tradition and is often appealed to in 
explanation of the paucity of analysis available in aesthetic experience or 
mystical experience. (And I might suggest in passing that in the case where 
highly fused events arise from strands that are not systematically presented, 
historical reconstruction as to the sources of the event will as a general rule be 
unsuccessful.) 

Regardless of our particular analysis, it is clear that the phenomena disclosed 
by these experiments pose formidable problems for storage theories of memory. 
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Selection and integration of relevant strands and the accompanying phenomena 
of fusion are stubborn facts that cannot be set lightly aside. But such facts are 
not easily discussed in the associative framework. 

If one wants to “push” the contextual analysis a little, an enlightening 
experiment is one that keeps the superficial form of the basic experiment above, 
but changes the quality of the events involved. If one takes away the “contex- 
tual glue” of the four themes, the results of the experiment change in dramatic 
ways. Peterson and McIntyre (1973) presented individual, unrelated sentences 
of the same length and grammatical structure as in the original experiment. 
When they examined recognition data for sentences that were actually pre- 
sented, they found that the recognition confidence curve ran in the opposite 
direction. With related sentences the subjects were sure that they heard the long, 
complex sentences and were uncertain about the short, simple ones. With 
unrelated sentences, subjects were sure about the short, simple sentences, but 
uncertain about the long ones. Of course, in a study with unrelated sentences, 
the only events available to the subject are the individual sentences themselves. 
The shorter and simpler the event, the more likely the subject can satisfy himself 
that it is really the same; the longer it is, the less likely he is to have integrated 
the event and the more possibilities there are that something is different. 

We must also notice that the phenomena that are apparent here are equally 
evident in a host of psycholinguistic experiments. The studies by Sachs (1967) 
on memory for form and content of sentences in stories, illustrate both the 
salience of quality and the effects of fusion. When her subjects were queried 
about a sentence that was embedded in the context of a story, the form of the 
sentence was incompletely remembered. The meaning of the sentence, however, 
was readily retained since it was an integrated strand in the texture of the larger 
event. Violations of meaning or sense were readily detected but form vanished 
with the specious present. 

If we contrast Sachs’ study with the psycholinguistic experiments that have 
used unrelated sentences (e.g., Clifton & Odom, 1966), we have another illustra- 
tion of recognition phenomena changing when the quality of the event changes. 
In the psycholinguistic experiments, for example, the syntactic form of the 
sentence is systematically related to false recognition and false recall, but the 
truth or falsity of the content described by the sentence (the difference between 
affirmation and negation) is no more important than whether the sentence is 
active or passive in form. The point is, of course, that there is no context in 
which the content of the sentence makes any sense or any difference. Hence, 
affirmation and negation have only syntactic meaning. That is, if each sentence 
is an unrelated event, then only variables at the level of construction of the 
sentence are of any importance. 

Experiments on miniature linguistic systems reveal similar phenomena. If 
subjects are trained to label items in a color-form matrix such as that used by 
Esper (1925), they learn the system but ordinarily do not remember which 
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specific instances they were given in the training trials, even though they may 
have seen the same instances many times. In general, memory is for systems, not 
for instances. With a small set of examples, subjects may remember particulars. 
With a liberal set of examples, however, they cannot reliably discriminate the 
instances on which they were trained from other acceptable instances in the 
system (Foss, 1968; Keeney, 1969; Segal, 1962). 


INTEGRATING INFORMATION 


I want to present now a more complex task that requires a good deal of work on 
the part of the subject. In this experiment the subject must “problem solve” to 
achieve the complete event that we want him to apprehend. This experiment was 
developed by J. Richard Barclay (1971) in extending the work of Bransford and 
Franks, Barclay’s experiments show both of the phenomena that we have noted 
earlier. First, changing the subject’s task changes the kind of event that the 
subject experiences. Second, taking individual sentences as events results in very 
different recall and recognition behaviors than those observed when the entire 
experience is integrated into a single event. 

In several different experiments Barclay presented subjects with sets of sen- 
tences. The sentences gave a piecemeal account of the pairwise relations of a 
series of five objects that were arrayed on some dimension such as “left-right” 
or “taller-shorter” or “faster-slower”; for example, there might be five animals 
standing in a left-right array, five men varying in height, or five cars varying in 
speed. 

The “comprehending” subjects were told that they would hear sentences 
describing the relations between the five animals (or men, or cars) and that they 
were to listen carefully to the sentences so that at the end they could properly 
line up the objects in order. The control subjects were told that they would hear 
a set of sentences that were about animals (or men or cars) and that they were to 
try to remember the sentences that they heard. After hearing the sentences 
several times, the subjects were given a comprehension test and a test of sentence 
recognition or recall. 

The first result of the experiments was the expected finding that 80-100% of 
the comprehending subjects constructed the correct array of objects. However, 
only 0-10% of the subjects who were trying to remember the sentences could do 
so, despite the fact that they heard the same information as subjects in the 
“comprehension” group. The second result was that the comprehending subjects 
were excellent in discriminating true sentences from false sentences but could 
not remember which of the true sentences they had actually heard. All of the 
true sentences seemed familiar to them, and about equally so, whether they were 
original sentences, correct paraphrases, or correct inferences from the original 
sentences. 
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The group that was specifically told to remember the sentences presented 
quite a different picture. These subjects were poor at recognizing which of the 
sentences they had heard and almost equally poor at separating true sentences 
from false ones. In other words, when subjects considered the sentences as the 
events of the experiment, they learned very little, their confusions were deter- 
mined by variables at the sentence construction level, and they showed evidence 
of massive interference. With the array itself as the event to be constructed, 
specific sentence memory was essentially nil; but the general constraint of what 
was an acceptable or “true” sentence was virtually perfect. The same results 
were obtained in recall experiments. The comprehending subjects recalled no 
false sentences and “recalled” true sentences independently of whether they had 
heard them or not. 

This experiment is a dramatic example of the principles we have encountered 
in both of the foregoing types of experiments. First, what the subject was 
instructed to do with the input was the most important determiner of the nature 
of what was remembered. Second, when the event itself was well fused, the 
specific instances from which the event was constructed were not readily 
discriminated. Finally, the availability of the global event in this case makes it 
possible for the subject to reject one class of sentences (the false ones) and to 
accept another class of sentences (the true ones) as likely to have occurred even 
though he could not discriminate what actually occurred. That is, the event (the 
representation of the array which is remembered) is related to sentences in such 
a way that permissible utterances can be generated or recognized with great 
accuracy even though actual, historical sentences cannot. What is remembered is 
definitely not verbal and probably not even linguistic, but it plays a key role in 
determining which linguistic constructions are recognized and recalled. 


DISCUSSION 


What kinds of lessons can we draw from these experiments? First and foremost, 
I think it is clear that we should shun any notion that memory consists of a 
specific system that operates with one set of rules on one kind of unit. What is 
remembered in a given situation depends on the physical and psychological 
context in which the event was experienced, the knowledge and skills that the 
subject brings to the context, the situation in which we ask for evidence for 
remembering and the relation of what the subject remembers to what the 
experimenter demands. 

Contextualism calls us back to considering what the subject believes and 
knows when we talk of memory. It simultaneously suggests that we look to the 
sources of this belief and to the subject’s ways of constructing and reconstruct- 
ing his experience. William James pointed out that the only thing that dis- 


15. REMEMBER THAT OLD THEORY OF MEMORY? WELL, FORGET IT! 427 


tinguishes memory from the other higher mental processes is just that belief. He 
argued that there is nothing unique in the object of memory and no special 
faculty evident in its manifestations. Apart from the belief that the construction 
of the mind is attributed to the past, he saw nothing to set memory apart from 
perception, imagination, comparison and reasoning. 

Such a claim is unsettling because it says: Memory is not a box in a flow 
diagram. It is also threatening since it seems to demand an understanding of all 
“the higher-mental processes” at once. Yet, that is what the data of our 
experiments suggest. Surely, these studies imply that we cannot deal with 
memory without dealing with instructions, perception, comprehension, in- 
ference, problem solving and all the other processes that contribute to the 
construction of events. The theoretical contextualist agrees. He argues that there 
is no other approach. To study memory without studying perception is to invite 
disaster in one of two ways: on the one hand, failing to understand the in- 
puts to the subject or, on the other hand, pushing all the difficult problems 
out of memory into the unknown perceptual domain for someone else to study. 
To study memory without studying inference is to be baffled by the transforma- 
tions that the subject puts on his experience. To study memory without 
studying language is to be confounded by paraphrase and the issues of meaning 
And so on. 

The contextualist avoids despair at this point, not by predicting success for a 
global assault on all higher mental processes simultaneously but by asking again, 
“What kind of an analysis of memory will be useful to you in the kinds of 
problems you are facing? What kinds of events concern you?” If you limit the 
events for some purpose, he will lead you into an analysis of context, textures 
that support experiences, strands that interrelate aspect of experiences, etc. 

You should notice that methodological issues and the choices of experimental 
paradigms become crucial when you take the contextualist view. It is not true on 
this view that one can study any part of the “cosmic machine” to just as good an 
effect as any other part. The important thing is to pick the right kinds of events 
for your purposes. And it is true on this view that a whole theory of an 
experiment can be elaborated without contributing in an important way to the 
science because the situation is artificial and nonrepresentative in just the senses 
that determine its peculiar phenomena. In short, contextualism stresses relating 
one’s laboratory problems to the ecologically valid problems of everyday life. 


CONCLUSIONS 
It would be easy to believe from this brief presentation that contextualism is 


merely eclecticism given a fancy name. It would also be easy to suppose that my 
plea is for miniature theories that will eventually integrate into a global theory. 
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And it is easy to imagine that all of us could be contextualists and at the same 
time continue working in just the same fashion that we always have on associa- 
tive theories. But none of these beliefs is correct. 

Contextualism is historically tolerant of other approaches and is always willing 
to examine a claim that the cosmic machine is all of one piece. But at rock 
bottom there is a profound difference in belief between associationism, which 
presupposes fundamental units and relations out of which all else is constructed, 
and contextualism, which presupposes that events are primary and that the 
quality of events determines what the possibilities are for a host of analyses. 

The associationist finds that semantic features (or word associates or images or 
what have you) are important in predicting recall of individual words. “Fine” 
says the contextualist, “but do not bet any appreciable amounts of money that 
those will be important considerations when you combine words in sentences or 
embed them in different contexts.” 

The associationist finds that grammatical features are predictive of memory 
distortions in single sentences and believes that he has a theory of comprehen- 
sion. “Look out,” says the contextualist, “Those are unlikely candidates for 
relevant variables for understanding sentences in context.” And so it goes. 

The contextualist does not believe in one order of phenomena that dominates 
the psychological world. He does not believe that there is a single paradigm of 
explanation such as mechanism. Explanation for the contextualist consists of 
explicating for some purpose the quality of an event that we find important or 
interesting. That entails in turn a description of its spread in time and its degree 
of fusion. It further calls for a discussion of the textures that support the quality 
of the event and the strands making up the texture and relating it to relevant 
aspects of its context. Unlike the associationist explanation, the contextualist 
explanation is put together from the top down, rather than from the bottom up, 
and it is oriented toward the event rather than toward the supposed machinery. 

I believe that the field of memory research is moving towards contextualism 
and I, for one, welcome it. I think that an awareness and acceptance of this 
movement will free us to explore the ecologically valid problems of memory that 
are of real importance to us. I think it will help us to be alert to the 
identification of new, emergent phenomena as they are encountered. In short, I 
think that contextualism provides a real, viable and optimistic alternative to the 
oldest theory of them all. 
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The amount of detailed information which 
an individual has at his command and his 
theoretical elaborations of the same are 
mutually dependent; they grow in and 
through each other. It is because of the in- 
definite and little specialized character of 
our knowledge that the theories concerning 
the processes of memory, reproduction, and 
association have been up to the present time 
of so little value for a proper comprehension 
of these processes. For example, to express 
our ideas concerning their physical basis we 
use different metaphors-stored up ideas, 
engraved images, well-beaten paths. There is 
only one thing certain about these figures of 
speech and that is that they are not suitable 
(Ebbinghaus, 1885, p. 5]. 


This chapter argues for the value of attempting to “unexplain”” memory. Our 
purpose is not to deny the importance of remembering, nor to deny that past 
experiences affect organisms’ subsequent interactions with the environment. But 
we question the fruitfulness of assuming that a concept of memory underlies 
these events. 

Of course, there are many ways in which one can use the term “memory.” If 
the term is used to mean “whatever is responsible for remembering” or “what- 
ever factors determine the effects of previous experiences on subsequent inter- 
actions” then it becomes an essentially circular concept. But is the term 
“memory” usually used in this way? It seems reasonable to suggest that current 
uses of the term memory frequently involve tacit or explicit assumptions not too 
different from those noted by Ebbinghaus: for example, that memory can be 
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broken down into a set of memories, that these consist of relatively independent 
traces that are stored in some location, that these traces must be searched for 
and retrieved in order to produce remembering, and that appropriate traces must 
be “contacted” in order for past experiences to have their effects on subsequent 
events. If memory is defined in this way, it becomes important to consider the 
possibility that the concept of memory (and memories) is simply one of many 
general hypotheses about the processes underlying remembering and the manner 
in which previous experiences have their effects on current events. In short, the 
“searching for traces” conceptualization of memory (hereafter called the mem- 
ory metaphor) may simply be one of many possible metaphors, and it may not 
be the most fruitful metaphor. 

This chapter begins the task of questioning the fruitfulness of the prevailing 
memory metaphor. To what extent does it lead to new insights? To what extent 
does it help us understand problems like perceiving, comprehending, learning 
and transfer? How fruitful is the current metaphor for conceptualizing the 
problem of remembering (which is the problem with which it is most closely 
allied)? The present chapter simply explores these questions. In effect, it is a 
working paper written in the hopes that others will be prompted to improve 
upon the present arguments or to show how they are wrong. 


SECTION 1 


Memory versus the Processes by Which Past Experiences 
Affect Subsequent Interactions with the World 


Attempts to understand most psychological phenomena lead to a consideration 
of effects of previous experience on subsequent experiences and interactions 
with the environment. This appears to provide a prima facie case for a general 
concern with “memory.” However, is it necessarily the case that the effects of 
past experience are to be accounted for in terms of “memories” represented in 
the form of stored traces which must be contacted in order to mediate interpre- 
tations of subsequent events? We shall begin to examine this question by briefly 
discussing the problems of learning and perceiving and distinguishing them from 
the problem of remembering. The phenomena of learning and perceiving are 
obvious cases involving the effects of past experiences. We briefly discuss views 
of these phenomena that do not involve memory metaphor assumptions. From 
these views we shall eventually raise questions designed to prompt search for an 
altemative to the memory metaphor that may also be applicable to the phe- 
nomenon of remembering. 

First consider learning; for example, the problem of what it means to learn to 
play tennis. As Bartlett (1932) noted, one does not learn particular responses, 
nor connections between responses and stimulus situations (for example, con- 
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nections between a particular response and the stimulus of a tennis ball coming 
over the net and bouncing in the right-hand corner of the court). Previous 
experience allows the creation of more finely tuned “novel but appropriate” acts 
(e.g., see Bernstein, 1967; Greene, 1971; Turvey, Chapter 9, this volume). But 
note that questions about tennis playing ability are not equivalent to questions 
normally asked when remembering is studied in the laboratory. The type of 
question asked in memory experiments is more equivalent to “Try to reproduce 
those exact strokes that you just made” or “Describe the places where the ball 
hit during your last volley.” 

It seems possible that a tennis player, for example, might be relatively accurate 
at answering the latter questions. And it might be argued that the ability to do 
so depends on stored “memories” of previous experiences. But even granting 
(for the moment) this “memory” assumption, one surely would not wish to 
simply equate the ability to play tennis with a list of previous memories of 
tennis playing experiences. Such an equation seems as implausible as equating 
one’s “knowledge of his native language” with a list of memories of previously 
experienced sentences or even paraphrases of these sentences (e.g., see Chomsky, 
1965). One’s ability to remember will undoubtedly be related to the nature of 
the knowledge acquired through learning, but it would seem strained to equate 
the problem of learning with the problem of how memories are stored and 
retrieved. 

A further example of differences between problems of learning and memory 
stems from J. Gibson’s (1966) claim that perception does not depend on 
memory (see especially his Chapters 12 and 13). The present authors find that 
reactions to Gibson’s claim are frequently adamant, and include exclamations 
like “That’s ridiculous, it is obvious that perceptions depend on previous 
experiences one has had.” But Gibson’s statement does not deny that past ex- 
periences affect the organism. Instead, it differentiates the problem of learning 
from the problem of how one forms and stores “memories.” Processes by which 
past experiences influence current perceptions need not involve some mediated 
contact with previously stored traces. Resistance to Gibson’s statement fre- 
quently reveals our tacit assumptions that learning necessarily results in “mem- 
ories,” and that these stored memories are responsible for mediating new 
interactions with the world. 

As an alternative to reliance on “memories,” Gibson assumes that learning 
involves the attunement of the system to invariant information. “The develop- 
ment of this attunement, or the education of attention, depends on past 
experience but not on the storage of past experiences” (Gibson, 1966, p. 262). 
Gibson’s theory therefore stands in contrast to those theories of perception 
which assume that sensations must be enriched by contact with previously 
stored memory traces (and see Gibson & Gibson, 1955). For example, Gregory 
(1972) argues that “. . . perceptions are constructed by complex brain processes 
from fleeting fragmentary scraps of data signaled by the senses and drawn from 
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the brain’s memory banks... [p. 707] .” Gregory makes explicit his assumption 
that previous experiences result in the formation and storage of memories that 
must mediate and enrich subsequent interpretations of perceptual events. As 
Mace (this volume) notes, such “memory enrichments” are assumed to be 
responsible for determining the meaning of sensations. In short, many perceptual 
theorists relegate the problems of interpretation and meaning to memory and 
cognitive theorists. Yet many of the latter theorists do just the reverse. For 
example, Anderson and Bower (1973) state: 

Even the most abstract structures seem capable of being reduced to perceptual data 

.... The language of the mathematic grammarians seems at times to be little more than 

uninterpreted symbols, mere tiles to be shuffled about according to string formation 


tules. But real languages always remain close to their perceptual base in their interpreta- 
tion [p. 155; italics ours] . 


If one simultaneously holds both a sensation enrichment theory of perception 
and a position similar to that of Anderson and Bower (and such a combination 
seems rather typical), it becomes questionable whether, or how, meaning ever gets 
into the system. Gibson, by rejecting the assumption that stored memories are 
necessary for perception, avoids such troublesome buck-passing. His alternative 
of perceptual learning as attunement of the perceptual systems provides a 
different and possibly more fruitful perspective for examining the effects of 
previous experience on subsequent experience and performance. 


Past Experience as Setting the Stage 
for the Articulation of New Information 


What are the characteristics of learning, perceiving, and comprehending that raise 
questions about the fruitfulness of the memory metaphor? One such characteris- 
tic that seems primary is that these activities all intrinsically involve the articula- 
tion of novel information. The problem for the memory metaphor is that storage 
and retrieval of traces only deals with old, previously articulated information. 
Memory traces can perhaps provide a basis for dealing with the “sameness” of 
the present experience with previous experiences, but the memory metaphor has 
no mechanisms for dealing with novel information. This suggests that a different, 
more fruitful metaphor might be found. 

As a first step toward such an alternative, consider the possibility that a major 
role of past experience is to provide “boundary constraints” that set the stage 
for articulating the uniqueness as well as sameness of information. And contrast 
this “stage-setting” metaphor with the assumption that previous experience 
merely provides an accumulation of memories that are responsible for meaning- 
ful interpretations of subsequent events. In our opinion, the stage setting 
metaphor is both plausible and important. For example, Gibson and Gibson 
(1955) argue that perceptual learning involves a process of differentiation rather 
than enrichment by previously accumulated memories. The ability to differen- 
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tiate an input from others presupposes that one can detect its uniqueness 
(relative to a set of alternatives) as well as its sameness (i.e., the invariant 
information across past and present experiences). Experience with a set of 
items sets the stage by providing boundary constraints that specify relative 
invariances with respect to past experience, but the present experience goes 
beyond these boundaries. Unique aspects of the present situation are arti- 
culated as well. In short, the nature of one’s present experience depends 
upon previous experiences but cannot be reduced to memories of such experi- 
ences. Perception is direct given the attunement that previous experiences 
provide. As an example, consider someone who usually drives a pickup truck and 
then drives a car. The smoothness of the car’s ride is very noticeable. Past 
experience with pickup trucks appears to set the stage for articulating the 
distinctiveness of a car’s movement, but the perception of the latter cannot 
simply be reduced to previous memories of experiences of driving a truck. 

Note that the stage-setting metaphor assumes that meaningful recognition can 
be direct given the appropriate level of organismic attunement (and see Gibson, 
this volume). It is therefore assumed that meaningful recognition need not be 
mediated by “contact” with previous memories that were stored. As Gibson 
(1966) states: “Recognition does not have to be the successful matching of a 
new percept with the trace of an old one. If it did, novelty would have to be the 
failure to match a new percept with any trace of an old one after an exhaustive 
search of the memory store, and this is absurd [p. 278].” It seems useful to note 
that others (e.g., Norman, 1969) have also pointed toward the absurdity of 
assuming an exhaustive search of memory in order to know that something is 
novel (e.g., X is a novel word). Yet the “solution” to this problem has frequently 
been conceptualized within the framework of the memory metaphor, namely 
that one automatically knows “where to look” for the item and finds that there 
is nothing stored in that “place.” Such a solution may seem adequate for 
explaining how someone knows that he has no information about something. 
But how can such an explanation account for the fact that the attuned perceiver 
can frequently articulate the nature of the new information? Such an act 
involves something more than the mere realization that one has no information 
about event X. For example, how can the truck driver articulate the uniqueness 
of a car’s ride instead of merely perceiving that the experience is different from 
anything experienced before? (Assuming, of course, the truck driver never rode 
in a car before.) 

Even if one assumes that an organism can “find” a trace and determine that it 
is similar but not identical to the new experience, there still remains a problem 
of accounting for the possibility of recognizing the specific nature of the 
uniqueness. At some point, the direct perception of novel information seems to 
be a necessary assumption. The stage-setting metaphor attempts to make this 
assumption explicit. The role of past experience is viewed as establishing bound- 
ary constraints that attune the organism and allow the articulation of unique 
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aspects of a situation. Past experience is not viewed as simply providing memory 
traces that must be found and matched or compared to inputs. Put another way, 
meaningful perception is assumed to depend on how an input fits into an 
abstract framework (e.g., see Bransford & McCarrell, 1974) rather than how an 
input makes contact with previous memory traces that were stored. 


Linguistic Comprehension 
and the Uniqueness of Understood Meanings 


The stage-setting metaphor has implications for conceptualizing questions about 
linguistic comprehension as well as perception. For example, relatively short- 
term effects of past experience can be equated with the effects of context on 
comprehension. Elsewhere (e.g., Bransford, McCarrell, & Nitsch, 1976) it has 
been argued that the understood meaning or significance of utterances is 
uniquely specified as a function of the contextually induced attunement of the 
listener. The argument is that linguistic comprehension cannot simply be 
equated with “finding,” “selecting,” and “amalgamating” previous “meaning 
elements” that were stored (e.g., see Katz & Fodor, 1963). 

Note that finding, selecting, and amalgamating assumptions lie within the 
domain of the memory metaphor, where the meaning elements are basically 
stored traces. It is difficult to see how such assumptions imply anything other 
than the notion that novel understood meanings (significances) are simply due to 
novel combinations of old “meaning elements” that were stored. A memory 
metaphor approach to linguistic comprehension neglects the problems discussed 
in the preceding section. That is, at some point “old” “meaning elements” were 
“new” or “novel,” and the memory metaphor does not provide a basis for 
articulating new information. The stage-setting metaphor suggests that if one 
assumes the existence of “meaning elements,” then the significances of these 
individual “elements” also change as a function of organismic attunement. 
From a stage-setting perspective, the role of past experience is not simply to 
provide a repertoire of stored meanings (or senses) that can be retrieved and 
novelly recombined in terms of syntactic rule structures and selection restric- 
tions. Instead the role of past experience is to provide the organism with abstract 
tools that can be used to articulate novel significances that a speaker or writer 
intends. 

Consider some possible understood meanings or significances of a statement 
like “I’m going to drive to Minneapolis tonight.” If the speaker is in St. Paul, 
Minnesota, the statement if not surprising. If the speaker is in California, one 
realizes that the person is in for a long drive. And if the speaker is in Europe, the 
statement seems strange. Or consider reading a newspaper headline like 
“Peace finally comes to Europe.” What is the significance of this statement? If 
one is reading an old newspaper, it is a historical fact. If one is reading today’s 
paper, it is understood differently. And what is the significance of a phrase like 
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“today’s paper”? The understood significance of this utterance changes every 
day. 

It seems obvious that a listener’s knowledge of (or past experience with) his 
native language plays an extremely important role in determining the signifi- 
cance of statements. Yet the examples immediately above suggest it might be 
fruitful to regard such linguistic knowledge as tools that can be used to 
understand unique significances as a function of contextual attunement. From 
this peispective, comprehension cannot simply be equated with the retrieval, 
selection, and amalgamation of previously stored “meaning units.” A “stage- 
setting” approach to comprehension orients one toward the potential for 
increasing refinements and changes in understood meanings or significances, And 
these understood meanings are uniquely specified as a function of the listener’s 
level of contextually induced attunement. 

Consider for example, what is involved in helping someone experience an input 
in a particular manner. Assume that a speaker wants a listener to experience the 
utterence “go buy some” as being extremely funny. What would the speaker 
have to do to produce such an effect? Clearly, the speaker would have to create 
some contextual support that provided a framework for experiencing a certain 
significance of this utterance. The context or framework provides a level of 
attunement that sets the stage for the nature of the experience afforded by a 
linguistic event. Could one ever exhaust the potential significances of an 
utterance? This seems unlikely. It seems more plausible to assume that a clever 
speaker could always specify some framework that permitted a unique articula- 
tion of information that the listener never considered before. According to the 
memory metaphor, such unique understanding arises from the novel combina- 
tion of old meaning elements. The role of context would be to select from 
among those possible stored meanings in order to find ones that fit the selection 
restrictions of the linguistic strings (but see Barclay, Bransford, Franks, McCar- 
rell & Nitsch, 1974; Franks, 1974). Of course, further elaborations of specific 
information might arise from deductive inference. But the nature of one’s 
deductions is a fynction of how one initially comprehends the input information 
(e.g., see de Groot, 1965; Katona, 1940; Staudenmayer, 1975); hence, one must 
still face the problem of accounting for differential interpretations of individual 
“meaning elements” as a function of the contextually induced level of attune- 
ment of the comprehender. 

To further highlight the problems involved, consider the selection and amalga- 
mation metaphor in light of a problem of nonverbal comprehension. For 
example, imagine a gesture of an outstretched hand with all five fingers ex- 
tended. What might such a gesture mean? When performed in the context of 
questions like “Who wants to go with me?,” “How many children are there in 
your family?,” “How tall is your father?,” etc., it has three very different 
significances. It seems strained to assume that one has stored a set of “meanings” 
of an outstretched hand and simply selects among these as a function of context. 
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Information about hands provides constraints on the significances understood by 
the comprehender, but the understood meaning seems to be a function of the 
relationship between the hand and the abstract framework generated by context 
rather than a function of a “match” to a previous meaning that was stored. It 
seems plausible to suggest that what is called “linguistic meaning” should also be 
viewed as information that sets constraints on the potential significances of any 
utterance. But the understood meaning of an utterance cannot simply be 
reduced to the selection and amalgamation of such linguistic meanings. The basic 
problem of understanding involves grasping the significances of inputs, and 
significances are uniquely specified as a function of the relationship between a 
contextually induced framework and the particular input. The stage setting 
metaphor attempts to emphasize that understanding involves relationships; it is 
an activity rather than a thing. 


Summary of Section 1 


To summarize, we have suggested that notions of “memory” underlie usual 
assumptions about the effects of past experiences on subsequent interactions 
with the environment. Yet one can assume that past experiences affect subse- 
quent experiences without necessarily assuming that it is the accumulation of 
stored memories that underlie such effects. One alternative is that past experi- 
ence “sets the stage” for subsequent experiences. From this perspective, the 
perceived meaning of experiences is uniquely specified. It depends on past 
experiences but cannot be reduced to contact with “stored traces” of past 
experiences. A strength of the stage-setting metaphor is therefore that it em- 
phasizes the human capacity for appreciating unique aspects of information 
never explicitly articulated before." 

Early in the present section, it was also suggested that the problem of how past 
experiences affect subsequent perceptions is not necessarily equivalent to the 
problem of remembering as it is studied in most laboratory experiments. The 
key problem with a memory metaphor approach to perceiving and comprehend- 
ing was its inadequacy for dealing with the novel and unique aspects of 
experiencing. Remembering appears to be a phenomenon in which novelty and 
uniqueness are minimized whereas “sameness with” or “repetition” of previ- 
ously experienced information is maximized. It follows that the memory meta- 
phor may be more applicable to the remembering domain. 

It is important to note, however, that the stage setting approach emphasizes 
that perceiving and comprehending are activities whose function is to articulate 
the significance of inputs. Significances are assumed to be uniquely specified as a 


"Our discussion of the stage-setting metaphor is necessarily cursory. We hope to develop it 
further in subsequent papers. In particular, we hope to contrast it with conceptualizations 
of “semantic memory.” The latter still fall within our definition of the memory metaphor. 
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function of the relationship between an input and some abstract framework or 
set of boundary conditions. The processes of understanding or “knowing” are 
therefore viewed as the generation of a particular significance as a function of 
organismic attunement. Perhaps the act of remembering also involves the genera- 
tion or creation of a certain experience that has a particular significance (that is, 
that experience X is a “memory”). If so, then it is the attunement constraints on 
creation that a theory of remembering must attempt to explain. The problem of 
remembering is the major concern of the remainder of this chapter. The next 
section explores questions about the fruitfulness of the “storage of memories” 
metaphor for providing insights into the problem of remembering. We then 
attempt to look at problems of remembering from a re-creation point of view. 


SECTION 2 


Remembering and the Memory Metaphor 


Most theories of memory are based on studies of remembering. And most studies 
of remembering involve an important constraint on the nature of the questions 
asked of subjects; namely, the subjects are asked whether certain inputs occurred 
in the context of a particular memory task. For example, in experiments on 
recognition memory, subjects are not asked to indicate whether they have ever 
experienced the test items; they are asked whether certain items were experi- 
enced in the context of that experiment. 

Within the memory metaphor, such context problems are usually handled by 
greatly increasing the amount and nature of the information assumed to be 
stored with each trace. For example, traces are assumed to have “list tags,” 
“time tags,” and “experiment tags” attached to them which code the necessary 
contextual information (e.g., see Anderson & Bower, 1972, 1973). Tagging 
approaches have been described by Watkins and Tulving (1975): 


According to tagging theory, remembering a word entails attaching an occurrence 
marker or tag to its representation in a permanent memory system, This representation 
is frequently conceptualized as a node in a semantic network. Recall entails gaining 
access to the node and deciding about the presence of an appropriate tag [p. 5]% 


A tagging approach does not seem fruitful for explaining contextual effects on 
comprehension (e.g., see Bransford & Johnson, 1973; Bransford & McCarrell, 
1974; Bransford, McCarrell, & Nitsch, 1976). For example, the sentence The 
haystack was important because the cloth ripped results in very different 
experiences depending on whether it occurs in the context of parachute versus 
house. It is not the “same experience” distinguished only by a contextual tag. 
Similarly, an utterance like Bill has a red car is innocuous and easily understood 
in the context of an experiment where it is presented as an example of 
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something to be remembered, rated, or analyzed. But try walking up to some- 
one, simply say “Bill has a red car” and see what happens. Generally the listener 
will be extremely perplexed. He will know what was said but not what was 
meant (e.g., see Bransford, McCarrell, & Nitsch, 1976). Again, the same 
utterance in different contexts is not “that same utterance” plus the mere 
addition of a context tag.” 

Of course, arguments about the effects of context on perceiving and under- 
standing need not affect one’s assumptions about memory. Anderson and Bower 
(1973), for example, relegate the former problems to HAM’s linguistic and 
perceptual parsers. For HAM, issues in perceiving and understanding are divorced 
from the problem of memory per se. In contrast, other theorists (e.g., Bartlett, 
1932; Kohler, 1947) argue that the processes underlying remembering are similar 
to those underlying activities such as perceiving and thinking. The present 
chapter attempts to support the latter point of view. As part of the memory 
metaphor, tagging conceptions imply that remembering depends on first finding 
an item and then finding a tag that specifies its context of occurrence (for 
example, that word X occurred in a particular experiment). An alternative more 
congruent with a stage-setting metaphor is that remembering does not involve 
“finding” stored items plus tags, but instead involves a re-creation of previous 
experiences. From this perspective, some degree of contextual information for 
stage setting may be a prerequisite for re-creating previous acts. For example, 
consider once again the task of helping a listener experience the utterance ‘‘Go 
buy some” as being extremely funny. To do so, the speaker would have to create 
some contextual support that provided a framework for experiencing a certain 
significance of this utterance. The context or framework would therefore pro- 
vide a level of attunement that constrains the nature of the experiences afforded 
by this linguistic event. There may exist similar attunement constraints on 
reexperiencing (that is, remembering). If so, then contextual information is not 
something simply appended to stored traces and searched for in order to 
produce remembering. Instead, some degree of contextual attunement could be 
a prerequisite for re-creating previous experiences. 

In the next section, we will discuss some general attunement constraints on 
remembering. Our purpose is to raise questions about the adequacy of equating 
context effects with the storage of tags that are appended to traces. In sub- 
sequent sections we more closely question the assumption that prerequisite 
contextual information merely affects how one comprehends test items, and 


? Anderson and Bower (1973, p. 348) note that a sentence like The rock crushed the hut 
must be stored as something like “In Context C, I studied that ‘The rock crushed the hut’.” 
The assumption seems to be that, if spoken in a different context, the utterance would be 
the same and only the context tag would be different. However, an utterance like Bill has a 
red car is understood very differently when presented in the context of an experiment 
versus a social context (for example). The information necessary to understand and hence 


the processes of comprehending differ as a function of context. 
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that memory per se only deals with the outputs of this comprehension process 
which are then used to “direct one’s search” for stored items. Imbedded within 
the memory metaphor are notions not only of contextual tagging but also search 
for traces and tags. These “search” aspects of the memory metaphor will also be 
questioned. 


Context and Remembering: 
General Attunement Constraints on Reexperiencing 


Consider some simple “experiments” conducted by the present authors. These 
illustrate some of our concerns with the importance of stage setting, that is, 
general attunement constraints, on the act of re-creating previous experiences. In 
the first experiment, the experimenter approached a group of people in a 
seminar and stated: “This is a recognition experiment. I want you to tell me 
whether you have heard these words before.” Then the experimenter added: 
“That is, heard them between 1:00 and 1:15 yesterday.” And the experimenter 
immediately began reading the words. Students violently objected to this proce- 
dure. A frequent comment was “Wait a minute, I don’t know where I was.” 
Such information was considered a prerequisite to making any recognition 
decisions. Of course, in most experimental situations subjects have no problems 
specifying the relevant contextual setting. The nature of the task and instruc- 
tions specify the setting in advance. 

Why the violent objections on the part of the subjects? Why did they first have 
to determine “where they were”? Did they need to know where they were in 
order to specify the appropriate “context tags” to check for? If so, what would 
these tags look like? Would they say “experienced at 1:03 on Monday, October 
2, 1974”? Would they say “list number 6,405, item number 3” (assuming that 
there is something equivalent to “list tags” in real world experiencing)? And if 
they recognized a particular input, could they then “read off” a host of tags 
indicating other previous experiences with that item? It seems more reasonable 
to assume that the act of re-creating “where one was” was actually a prerequisite 
for determining the nature of the experiences afforded by the presentation of 
the test words. 

A second “experiment,” a thought experiment, further illustrates the impor- 
tance of considering the hypothetical process of recreating a perspective or level of 
attunement (i.e., Bartlett’s “attitude”) that sets constraints on the nature of the 
experiences afforded by various probes. Assume that one wishes to study 
relatively long-term memory; for example, memory after one month. The typical 
way to conduct such a study would be to bring the subjects into a laboratory, 
present them with some inputs (e.g., 40 words) and perhaps test them (e.g., on 
recognition memory). One would then dismiss the subjects without further 
mention of a later test. After one month, the subjects would be presented with 
the 40 test words (plus foils) and asked to recognize those heard earlier. Even 
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after one month, the subjects would probably perform better than chance in this 
type of task. And each recognition decision could probably be made within a 
relatively short period of time. 

Now contrast this condition with another one: The subjects are simply 
brought into the laboratory, read the same 40 target words, and asked to 
indicate whether they can re-create a particular time that they had experienced 
each word any time during the last month. If one actually attempts to re-create a 
particular episode where a word was heard (rather than simply guess on the basis 
of assumed probability of occurrence), this task is very difficult. Intuitively, it 
takes much longer to make a decision under these latter conditions than it does 
in the typical memory experiment noted above. But if one assumes that the 
recognition decisions are based on finding context tags attached to items, it 
seems strange that the task of finding any tag within the last month (Condition 
2) is more difficult than finding a particular tag that occurred almost 4 weeks 
earlier (Condition 1). From the present perspective, the subjects in Condition 1 
have an advantage. They know they were in an earlier experiment, and that the 
task is relevant to just that context. Knowledge of their contextual setting 
therefore provides an abstract level of attunement that constrains the experi- 
ences afforded by the nature of the probes. Of course, the presentation of 
probes should further focus or articulate this general level of attunement. This 
notion is discussed later in the chapter. 

Consider a third “experiment” related to the present discussion. An experi- 
menter and a friend were sitting in the latter’s living room. The experimenter 
pointed to a brass table and said, “Tell me what it makes you think of.” The 
friend’s responses were “Brass, India, round, pretty, etc.” The experimenter then 
made up some excuse for the question and the conversation turned to other 
topics. Somewhat later, the experimenter said, “Try to remember what you did 
last Christmas.” The experimenter then pointed to the brass table and said, “Tell 
me what it makes you think of.” This time the friend’s responses were very 
different, The friend said, “Oh yes, I remember when you were so mysterious in 
the store where we were Christmas shopping and you wouldn’t let me see what 
you bought. Then I remember opening this large package on Christmas morning 
and finding the brass table,” etc. In short, the table now acted as a prompt to 
allow the friend to re-create previous, personal experiences. But these experiences 
weren’t re-created every time the friend looked at the brass table. However, given 
a certain “attitude” (cf. Bartlett) or “level of attunement” (e.g., “What hap- 
pened last Christmas?”), the table acted as an effective “retrieval” cue. 

It is ecologically adaptive that familiar objects do not automatically remind us 
of previous experiences. If they did, our minds would be deluged with past 
experiences. Every time we saw our car, ora friend, or our office coffee cup, or 
heard a recently experienced word, we would reexperience those situations in 
which those entities have been previously encountered. This would be a cumber- 
some state of affairs. On the other hand, events do sometimes seem to sponta- 


16. TOWARD UNEXPLAINING MEMORY 443 


neously remind one of previous experiences. Certain “memories” seem to appear 
from “out of the blue.” Yet it is not clear that the phrase “out of the blue” is 
really appropriate. For example, a colleague has remarked that faces of people 
start to look familiar (that is, remind him of colleagues from around the 
country) whenever he is anticipating going to a psychological convention and 
seeing old friends. This hints that stage setting may be behind these memories 
“out of the blue.” In other cases, it often seems that during a conversation 
certain comments remind one of particular facts or stories. But it is not 
necessarily the case that these same comments, if spoken in the context of 
different conversations, would produce the same results. For example, one of us 
was conversing with a colleague and said: “It’s such a beautiful day. I feel like 
simply sitting on the sidewalk and watching people. Maybe TIl even set up a 
stand and meet people by selling lemonade.” Immediately after this comment 
the colleague was asked: “Does my choice of words remind you of anything?” 
The colleague’s initial answer was “no.” Then, after a relatively long pause, she 
said: “Do you mean the stimulus trigram used in our studies of recognition 
memory?” This trigram (that is, enterprise: sidewalk, lemonade), in fact, consti- 
tuted the code name used to refer to these recognition studies. And more than 
five such studies had recently been conducted. Despite this fact, the words 
sidewalk and lemonade had not automatically prompted the recreation of this 
information given the conversational context that occurred. 

The preceding situations seem to exemplify some strong constraints on any 
adequate theory of remembering. Such a theory must not only explain how 
internally generated or explicitly presented cues do help one recreate previous 
experiences, but it must explain why potential cues do not always prompt one 
to remember. As Bergson (1911) notes in his Creative Evolution: “We trail 
behind us, unawares, the whole of our past; but our memory pours into the 
present only the odd recollection or two that in some way complete our present 
situation [p. 167].” It seems that any adequate characterization of one’s 
“present situation” must ultimately include an account of the global level of 
“attunement” set by the context of the situation. This level of attunement will 
affect the nature of the experiences afforded by potential cues. Sometimes, of 
course, one may recognize that something is merely familiar (for example, a 
song, a face, a statement). In such cases one may attempt to recreate more 
information about the situation, for example, where or when the input was 
experienced before. In everyday situations we therefore do not always have to 
recreate the exact context in which an input was originally embedded in order to 
feel that it was previously experienced. Yet the above discussion suggests that 
even such feelings of familiarity do not necessarily come from Ben: of the pues 
They appear to depend on the organism’s momentary state of “attunement. 3 

To summarize this section, we have argued that some level of “attunement 
seems to set the stage for remembering. Attunement is a neglected aspect of 
remembering for the literature that approaches remembering from the perspec- 
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tive of the memory metaphor. The stage-setting metaphor stresses the impor- 
tance of further elucidating the nature of this prerequisite attunement. However, 
mere recognition of the potential importance of prerequisite stage setting for the 
overall problem of remembering would not constitute critical evidence contrary 
to the memory metaphor. It is doubtful, for example, that anyone would deny 
the importance of comprehending test items as well as acquisition items in a 
memory task. One could still remain within the memory metaphor by subscrib- 
ing to some variation of the tack taken by Anderson and Bower (1973). For 
them, such stage-setting factors are presumably relegated to initial linguistic and 
perceptual parsers. Given this assumption, the problem of remembering begins 
where the parsers stop; that is, with some comprehended or encoded representa- 
tion of the fest items. One can remain within the memory metaphor by assuming 
that it is the test-item-as-comprehended that “directs one’s search” for stored 
traces. The next section examines these search assumptions. Its purpose is to 
raise questions regarding their fruitfulness for understanding remembering. 


Search and the Memory Metaphor: 
What Is Presupposed? 


We have argued that contextual constraints on remembering might be incorpo- 
rated into the memory metaphor by elaborating the nature of the comprehended 
test items. The comprehended test item then directs one’s search for tagged 
traces. Attention must therefore be focused on the “search” aspects of the 
memory metaphor. 

Consider how the test-item-as-comprehended affects memory; for example, 
recognition memory. A number of theorists have shown that test context affects 
recognition (e.g., Bobrow & Light; cf. Bower, 1972). Light and Carter-Sobell 
(1970), for example, presented the subjects with acquisition sentences like She 
likes strawberry jam. During recognition, the subjects were presented with 
phrases such as blueberry jam versus traffic jam and asked to indicate whether 
they had heard the last word in each phrase (e.g., jam) during the acquisition 
task. Those subjects presented with the test phrase blueberry jam exhibited 
better recognition performance than subjects presented with traffic jam. In 
short, context affected subjects performance even in a recognition task. Fre- 
quently, the account of such results assumes that comprehension involves the 
“tagging” of a particular meaning or sense of a lexical item, and that remember- 
ing involves a process of “finding” the appropriate meaning trace plus tag (e.8., 
see Reder, Anderson, & Bjork, 1974). For example, the test phrase traffic jam 
may direct one to search for a meaning of jam that is different from the meaning 
heard during acquisition. One therefore “looks” at the wrong meaning and fails 
to find the appropriate tag. The cue blueberry jam, on the other hand, may more 
probably lead one to the originally expressed meaning of jam; hence one is more 
likely to find the appropriate tag. In short, the context in which a test item is 
embedded appears to influence one’s interpretation of that item, and that 
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interpretation affects where one searches or looks. Put more generally, knowl- 
edge of what one is looking for influences the nature of one’s search processes. 
This seems to be a basic assumption underlying “search” portions of the 
memory metaphor. Even if one assumes that “recognition is automatic,” often 
behind this view is the further assumption that one automatically knows what 
one is looking for and “where to look” for this stored item (or stored meaning) 
plus tag. 

Consider what the notion of search means when instantiated in everyday 
activities. Assume that someone is looking for a pair of socks. He will obviously 
have more success if he has a separate place for his underwear, shirts, pants, 
coats, etc., than if all of his clothes are simply stored in one clump. In the 
former case, the person can simply restrict his search to the underwear drawer. 
And if he is searching for a screwdriver rather than a pair of socks, it is unlikely 
that he will look in his underwear drawer. His search processes will be modified 
as a function of the likely places in which articles will be stored. 

But note that these efficient search strategies presuppose knowledge about 
where to search for items and what items one is searching for. For example, one 
must already know that his clothes are stored in separate categories in order to 
avoid searching for socks in the coat closet. And one must already know the 
likely places in which he has stored a screwdriver in order to avoid searching for 
it in his underwear drawer. Where do such knowledge constraints come from ina 
memory experiment? For example, how could recognition or retrieval probes 
direct one’s search processes or tell him “where to look?” 

In everyday situations, some search problems are handled by special filing 
systems that tell one where to look for items. A library cataloging system is an 
example. Some theorists have proposed that human memory works in an 
analogous manner (e.g., memory is “content addressable”). An obvious problem 
with this analogy is that the library system only works due to the comprehen- 
sion processes of its users. Within the memory metaphor, the comparable 
comprehending homunuculus has been relegated to the initial parsers. If one 
follows the analogy, the domain of the memory metaphor must therefore be 
restricted to the cataloging system itself without a comprehending user. Or it 
must at least be restricted to a user who rather automatically follows some 
prearranged search strategy without becoming involved in additional comprehen- 
sion processes along the route that alter the prearranged search. To be cogent, it 
appears that the memory metaphor must assume that memory search becomes 
essentially algorithmic once given initial comprehension of the test item. What, 
however, specifies the search algorithms? Is there one algorithm, or a fixed set of 
algorithms? Must subjects store search algorithms in addition to memory traces 
plus tags? Would the account merely replace the question of how one finds a 
trace with the question of how one finds a search algorithm? i 

We reiterate that theories of direct access (or automatic recognition) still seem 
to presuppose that one “finds” a stored trace or memory. Assumptions of direct 
access do not obviate the need to specify the mechanisms responsible for such 
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feats. However, a number of theorists (e.g., Mohs, Wescourt, & Atkinson, 1975) 
have argued against notions of direct access. As an alternative, they suggest that 
inputs are stored within structures, and that it is the organization of these stored 
structures which determines the effectiveness (in terms of probability correct or 
speed) of search. For example, a frequent assumption is that items may be 
stored together, as in a categorical or hierarchical manner. One can then restrict 
his search to some category or set of categories, and the smaller the set size or 
category the greater the speed and probability of effective search. In our 
opinion, such theories still presuppose some notion of “direct access.” The 
access is now to a set, category or conceptual entity rather than the target item. 
Such theories make the further assumption, however, that once one “finds” the 
appropriate structure, search is determined by the particular structure that was 
stored. 

It seems useful to raise some questions about hypothesized “stored structures” 
and about access to items in such structures. For example, if subjects have stored 
a set of inputs in a particular structure, must they then invariably search through 
that structure in a particular manner? Is search again algorithmic? If there are 
optional ways to gain access to elements in the structure, then what determines 
the ease of access? This latter question becomes equivalent to questions about 
how probes (e.g., as in recognition or cued recall) facilitate remembering. More 
particularly, the issue concerns constraints on the effectiveness of various 
probes. For example, one constraint seems to be that there are an “optimal 
number of items” per category or cue (e.g., Mandler, 1967). 

Consider a study by Bransford and Nitsch (in preparation). Its purpose was to 
instantiate some intuitively obvious constraints on remembering. In particular, 
the study attempted to show that: (1) certain hypothesized structures of storage 
may “explain” one type of memory performance (e.g., free recall) but fail to 
explain another type (e.g., cued recall); (2) various cued recall probes permit 
subjects to “gain access” to different parts of hypothesized stored structures, 
hence there are variable points of “entry” into such structures; (3) there are 
definite constraints on cue effectiveness; in particular, constraints on the number 
of items that can be “associated” with a particular cue. Potential implications of 
the study are discussed below. 

The Bransford-Nitsch study consisted of reading 28 acquisition sentences to 
three different groups of subjects. All subjects heard the same predicates (tuning 
a violin, catching a ball, etc.). Only the subject nouns were changed. Group 28 
heard 28 distinct subject nouns. Group 7 heard seven different subject nouns 
with four predicates for each subject noun (that is, they heard four sentences 
about a boy, four about a policeman, four about a truck, etc., and these were 
blocked during acquisition). For Group 1 all 28 sentences were about a single 
topic, that is, a boy. All subjects were asked to image each sentence during 
acquisition (for example, “image a boy tuning a violin”). Following acquisition 
they were asked to free recall. 


16. TOWARD UNEXPLAINING MEMORY 447 


Free recall results indicated that Group 7 was superior to Groups 28 and 1. 
Group 1 exhibited the lowest recall. Such results parallel what would be 
expected given assumptions about benefits of categorization plus assumptions 
about the optimal number of items per category. For example, subjects in Group 
7 could be expected to have “stored” the sentences in seven different cate- 
gories and to have searched these categories. And indeed, their recall proto- 
cals were clustered in terms of the categories they had heard. But the subjects in 
Group 1 heard too many members of a single category, hence their ability to 
recall was limited. 

Following free recall, all subjects were given probes to aid them in retrieval. 
One-half of the subjects in each group received the subject nouns as retrieval 
cues and the others received the object nouns. Under subject noun cue condi- 
tions, the subjects in Group 28 exhibited the best performance. Group 7 subjects 
were second and Group 1 subjects were the worst. In fact, Group 1 showed no 
improvement between free and cued recall.? 

Now consider the results from subjects given the object nouns as retrieval cues. 
These results (as were those for subject noun cues) were scored only in terms of 
correct verb recall in order to compensate for the different numbers of subject 
nouns each group heard. And verbs were picked so as to minimize accurate 
guesses on the basis of general knowledge. For example, besides tuning a violin, 
one could play, buy, drop, carry a violin, etc. The number of incorrect verbs 
recalled was, in fact, less than 1%. Under object-cueing conditions, all three 
groups exhibited equivalent performances, and averaged 25 of 28 sentences 
correctly recalled. These results (especially considering Group 1) indicate that 
the problem of the optimal number of items per category is not a limitation on 
storage. Could it be a limitation on search? 

Assume that the input sentences were stored in different organizational struc- 
tures for the three different groups of subjects: Group 28 subjects “stored” each 
sentence individually, those in Group 7 stored them in terms of 7 different 
categories, and Group 1 subjects stored them all under one category. How did 
the presentation of retrieval cues help the subjects augment their recall? For 
example, when the subjects received the object nouns as cues, did they then 


3 Superficially, there is a flaw in the comparison of cued vs. free recall scores for Group 1 
subjects. Cued recall should not exceed free recall if, during free recall, all the cues SPUR 
presented during cued recall were accessible to the subjects. For Group 1 subjects, A 
explicitly presented cue(s) (that is, boy) were undoubtedly accessible during free recall. 
Note, however, that the study still permits the conclusion that there is a definite constraint 
on cue effectiveness. When all sentences are cued by the same cue (Group 1), cued recall is 
inferior to Groups 7 and 28. There is also a different way to view these data. Suppose one 
looks at the ease of recalling a particular target sentence (e.g., “a boy tuning a violin ) as a 
function of the list context in which it is embedded. The cue “boy” is much more N 
for recreating the target sentence if the subjects have heard only one sentence pose od 
(Group 28) than if they heard all sentences about a boy (Group 1). Intralist cue effective: 
ness is a function of the total set to which an input belongs (Garner, 1974). 
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have to search through their various stored organizational structures in order to 
“find” the proposition containing the object noun? If they did, then why 
couldn’t they simply search and find that proposition under other recall condi- 
tions as well? It was assumed that the subjects in Group 7 stored the sentences in 
terms of seven categories organized by subject nouns and searched through these in 
free recall. Did these subjects have to search through these same categories when 
given the object nouns as cues? This seems highly unlikely given that their recall 
cued by object nouns was superior to recall cued by subject nouns. Why should 
subjects be able to effectively search for items containing the appropriate object 
nouns, yet fail to search as effectively for items containing the subject nouns? 
Similar arguments can be made about the performance of Group 1 subjects: If 
the subjects had to search to find a proposition containing a particular object 
noun, why couldn’t they search just as well when given the subject noun as a 
cue? 

The above results suggest that the presentation of cues allowed the subjects to 
somehow “modify their search processes.” But it has already been noted that 
the ability to “know where to look” presupposes some knowledge of “where 
things are likely to be stored.” In addition, if one postulates the existence of 
certain “stored organizational structures” on the basis of free recall performance 
then it would appear that the presentation of cues allowed the subjects to 
somehow “bypass” such structures. Perhaps the notion that inputs are “orga- 
nized and stored together” is also an unfruitful metaphor. One might alterna- 
tively view performance in free recall as depending on the implicit generation of 
retrieval cues (and see Tulving, 1974). The organization postulated on the basis 
of free-recall performance would therefore not necessarily reflect the “structure 
of storage” as much as it reflects the availability of the implicit cues that 
subjects themselves use.* 

Of course, the possibility that free recall is essentially an implicit cued-recall 
task does not solve the problem of how retrieval cues operate. If one doubts the 
assumption that cues or probes automatically tell one “where to look” for 
stored traces (plus appropriate tags) then what alternative metaphors are left to 
pursue? And which metaphors can gracefully incorporate obvious facts like 
those illustrated in the Bransford and Nitsch study; for example, the fact that a 
probe like “A boy” is much more effective for helping subjects remember a 
sentence like A boy tuning a violin if the subjects have heard only one rather 
than 28 sentences about “a boy.” Must one postulate a limit on the number of 
“addresses” (indicating “where to look”) that can be “stored” with any poten- 
tial cue? 


We do not mean to suggest that remembering does not depend on initial encoding (for 
example, on what might be called initial “storage”). Initial encoding obviously influences 
the effectiveness of retrieval cues. 
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One theoretical possibility that is consistent with the memory metaphor might 
be that all propositions about a concept (e.g., about a boy) are attached to the 
same concept node in a memory network (e.g., see Anderson, 1975; Anderson & 
Bower, 1973). The more facts one learns about a concept, the harder it should 
be to retrieve old facts. In addition, “the search for a target proposition is 
subject to a cut-off time after which searching will cease” (Anderson, 1975, p. 
250). This model still assumes content addressability of individual elements (for 
example, direct access to the “boy” node). It must also assume direct access to 
the object nouns in order to account for the effectiveness of those items as 
retrieval cues, In addition, the model postulates a limitation on search. Later, we 
discuss data that appear to raise doubts about the search limitation explanations 
of the Anderson and Bower model. First, however, we attempt to recast the 
problem of remembering into a “re-creation” metaphor rather than a “search for 
stored traces” metaphor. In the process, we attempt to redefine intuitively 
important concepts such as “direct access,” “retrieval,” “search,” etc. For 
example, search processes are assumed to involve a gradual refinement of one’s 
level of attunement in an attempt to re-create a particular experience. Such 
processes are therefore assumed to be more like those involved in creating a 
particular action (e.g., see Bernstein, 1967; Greene, 1971; Hayek, 1952, 1969; 
Turvey, this volume) than they are like a search for some thing stored in a 
particular place. 


SECTION 3 


Toward a Re-Creation Metaphor of Remembering 


Consider Bartlett’s (1932) claim that the act of remembering is related to 
thinking (and see Neisser, 1967). To what extent might it be fruitful to view 
remembering in this way? In thinking, one may experience a general feeling for 
the abstract form of some endeavor. One may begin with a vague “feeling of 
knowing” that some argument is erroneous or that she or he has some poten- 
tially fruitful idea. Only gradually does this abstract framework become articu- 
lated or focused. Finally it may reacha level sufficient for specifying particular, 
concrete ideas about what should be done. | 

In a similar manner, the process of remembering often seems to begin from an 
abstract “attitude” (cf. Bartlett, 1932) or “level of attunement.” But this needs 
to be focused or articulated in order to achieve sufficient resolution to create a 
conscious experience that we call “remembering events.” Sometimes, of course, 
one has a mere feeling of familiarity (e.g., what does that remind ae of?) or 
“feeling of knowing” (¢.g., “I’m sure I could recognize it if I heard it”; see Hart, 
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1965). Given a particular “attitude” or “level of attunement,” the presentation 
of a probe (e.g., a “retrieval cue”) can help one achieve further articulations. 
Phenomenally, the presentation of a probe frequently results in an “aha” 
experience, especially if one has been trying to recall information and then 
receives an appropriate probe. What processes underlie these “aha” experiences? 
More generally, what processes underly our abilities to re-create those experi- 
ences that we call remembering events? 

Consider the possibility of viewing “retrieval” (re-creation) processes from the 
perspective of direct resonance. The simplest example of resonance is one 
involving tuning forks. If one strikes a “C” fork, another “C” fork in the 
vacinity will resonate to the vibrations that the former fork creates. But a 
resonance theory of remembering must clearly go beyond a mere “tuning fork” 
analogy. First, resonance must be based on semantic “resemblances” rather than 
merely on acoustic properties. Second, it seems fruitful to avoid the notion that 
one simply stores discrete “tuning forks” (that is, traces) that “await” the 
resonance process. It is obvious that previous experiences will “prime” or “tune” 
the organism, but the nature of this tuning may necessitate assumptions other 
than those about relatively independent traces that must be “found” in spatially 
discrete locations. Indeed, the heart of any resonance theory must involve the 
notion of attunement that sets constraints on the resonance processes. Without 
considerations of attunement, the resonance metaphor would quickly degenerate 
into a variant of the assumption that probes automatically tell one “where to 
look” for items that were previously stored. 

As an introduction to attunement constraints on resonance, consider the 
Bransford and Nitsch study discussed earlier. Here it was argued that the 
presentation of probes seemingly allowed the subjects to “bypass” hypothetical 
“stored organizational structures.” In fact, it was argued that assumptions about 
the organization of “what was stored” (based on free recall data) may more 
accurately reflect the accessibility of implicitly generated cues that it reflects 
the form of “spatially organized storage” per se. In addition, the study docu- 
mented an intuitively obvious constraint on cue effectiveness: the cue “a boy” 
was much more effective for helping the subjects re-create previous experiences 
(for example, re-create an experience of hearing A boy tuning a violin) if the 
subjects had heard only a single sentence about “a boy” rather than 28 sentences 
about “a boy.” What is the nature of this constraint on resonance? Why is there 
a limit on the optimal number of items per category or cue? 

Consider the possibility that a probe must allow one to further focus or 
articulate his present level of attunement. As noted earlier, this problem can be 
seen as analogous to having a general feeling for an idea but needing further 
information in order to articulate some concrete, conscious experience. A similar 
process could underlie the problem of re-creating previous experiences; that is, of 
remembering events. For example, given 28 sentences with different subject 
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nouns, a cue like “boy” would be sufficiently unique to allow the necessary 
degree of resolution to re-create a previous experience. Given 28 sentences 
beginning with “a boy,” however, presentation of the cue “boy” would essen- 
tially be ambiguous. It would be analogous to presenting the subjects with a cue 
like “you heard something” when they had heard 28 different subject nouns 
during the acquisition task. 

Assume, therefore, that some degree of focused resolution is a prerequisite for 
re-creating previous experiences. This would be equivalent to assuming that a 
probe must specify unique aspects of some experience, or that a probe must 
allow one to unambiguously differentiate among previous experiences he has 
had. Of course, aspects of the problem of differentiation are well known (e.g., 
see E. J. Gibson, 1969; Saltz, 1971). An emphasis on differentiation nevertheless 
seems especially important when considering factors that influence the effective- 
ness of retrieval cues. 

Consider the problem of differentiation or uniqueness from a slightly different 
perspective. At Vanderbilt, Barry Stein has conducted a number of retrieval 
studies using similes as target stimuli. His studies extend previous work on 
similes that J. G. Nesbitt (1974) began. In Stein’s studies, typical similes might 
be A pin is like a nail; A bumper is like a statue; A pliers is like a crab, etc. 
During acquisition, the subjects in Group 1 were simply asked to indicate the 
ease with which they could “solve” each of a list of 24 similes. Following 
acquisition, they were read the subject noun and asked to recall the second 
member of each pair. Cued recall reached an overall level of 59% accuracy. Given 
that the subjects indicated they had solved a simile, cued recall reached a level of 
78% accuracy. For those similes that were impossible to solve during acquisition 
cued recall was only 35%. 

Consider a second group run in Stein’s study. These subjects heard the same 
similes in the same order. During acquisition, however, they were asked to rate 
each simile on a “soft-hard” dimension rather than an “easy—difficult” (to 
solve) dimension. That is, they were to rate whether the two items in each simile 
were relatively the same or different on a hardness-softness scale. Following 
acquisition, the subjects in Group 2 were given the subject nouns as retrieval 
cues. The procedure was identical to that used in Group 1. Gor Group 2 
subjects, overall cued recall performance dropped to a level of 21% accuracy. 
Performance reached a level of 36% accuracy given that the subjects had decided 
that two items were the same on the soft-hard dimension. Acquisition ratings of 
“different” resulted in a cued-recall rate of 14%. In short, Group 2 subjects 
performed far below Group 1 subjects. It is instructive to note that in subsequent 
studies Stein found no differences between Groups 1 and 2 on recognition for 
individual subject and object nouns. The groups differed slightly in their abilities 
to recognize appropriate versus inappropriate pairings of previous acquisition 
items, but the differences were not nearly large enough to account for the 
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magnitude of the differences in cued recall. What might account for this large 
decrement in Group 2’s cued recall? 

Note that the “retrieval” problem in Stein’s study is not one of “finding” the 
“stored stimulus item” given a retrieval probe. This is not where group differ- 
ences occur. Instead the problem is one of being able to “follow the connection” 
between the first and second member of each pair. And there was only one 
“connection” attached to each individual subject noun.° Does the ability to 
remember depend on “finding” or “following” connections, or on something 
else? 

Consider the initial experience of Group 1 subjects. For them, each simile was 
presumably solved in a relatively unique manner. For example, to solve a simile 
such as A pin is like a nail, one must interpret “pin” in a particular manner (e.g., 
one may focus on the fact that it is long and pointed). Given the probe “a pin,” 
it seems reasonable to assume that subjects must re-create their original experi- 
ences (which, for example, would include focusing on a pin’s pointedness, etc.). 
This latter information would then uniquely specify another experience involv- 
ing pointedness (i.e., a nail), hence cued recall would be relatively good.° 

Contrast this situation with that found under Group 2 conditions. Here, the 
experiences generated by a probe like “a pin” would involve a respecification of 
its relative hardness. But a host of other items on the list were thought of in 
terms of their hardness, hence this information would not be sufficient to 
provide a focused resonance for re-creating the appropriate item from each pair. 
Viewed from this perspective, Group 2 in the Stein study is functionally 
equivalent to Group 3 in the Bransford and Nitsch study (where subjects heard all 
sentences about “a boy” and received “boy” as a retrieval cue). In the Stein 
study, the subjects have essentially been provided with the same functional cues 
for a large number of different events. 

An analog to Group 2 of the Stein study might be to present a list of animal 
similes; for example, A whale is like a rabbit (because they both breathe), A 
horse is like a spider (because they both move), A gnat is like a bull (because 
they both eat), etc. Intuitively, such a list should be extremely difficult to 
remember. In fact, we find that given the subject nouns as retrieval cues, the 
subjects average less than 4 items correct out of 12. But the problem is not 
simply due to the fact that all “stimulus” and “response” terms are animals. 
These same animal words can be re-paired so that the cued recall performance 
will be excellent. For example, consider pairs like A whale is like an elephant; A 


* Because there is only one proposition per subject noun, Stein’s results are not easily 
handled by simply postulating some limit on search due to too many propositions attached 
to a single noun (e.g., see Anderson, 1975). 

‘Experiments III and IV in Barclay, Bransford, Franks, McCarrell, and Nitsch (1974) 
document the importance of re-creating one’s initial acquisition perspective in order for a 
retrieval cue to be maximally effective. 
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horse is like a zebra; A gnat is like a mosquito, etc. Under these pairing 
conditions, our subjects averaged 11.5 (of 12) items in cued recall. These results 
are by no means surprising. Given an inspection of the lists, it is obvious that one 
will be easy to remember and the other extremely hard. But what underlies these 
“clear intuitions”? Why is one list so easy and the other so hard? 

It seems reasonable to assume that there exist some unique resemblances 
between a whale and an elephant (for example). Both are animals that are 
relatively large. And we find that similes like “A whale is like a skyscraper” are 
also recalled excellently given the subject noun; hence the fact that a whale and 
elephant are both animals does not necessarily play a crucial role. On the other 
hand, a whale and a rabbit are both animals that exist, breathe, eat, move, etc. 
But a host of other animals fit this description as well. Such information 
therefore fails to help one differentiate among other acquisition experiences that 
occurred. 

Of course, it is tempting to assume that the subjects in Stein’s “soft-hard” 
condition simply failed to “form” the appropriate “connections” among items. 
Similarily, it seems intuitively obvious that similes such as A whale is like a 
rabbit and A gnat is like a bull are less adequately comprehended than similes 
such as A whale is like an elephant; A gnat is like a mosquito, etc. With respect 
to the two lists of animal similes, for example, one can literally “feel” the 
greater degree of comprehension (e.g., see Bransford & McCarrell, 1974) or 
“depth or processing” (e.g., see Craik & Lockhart, 1972) in the second list as 
opposed to the first. Put more globally, it seems obvious that “quality of 
encoding” can affect accuracies in remembering. But what is the nature of the 
relationship between encoding and remembering? Does quality of encoding 
affect the “strength” of stored traces and connections among traces? Perhaps an 
emphasis on re-creation suggests an alternative metaphor. Perhaps the degree to 
which events, at their original occurance, are differentiated from the rest of 
one’s knowledge affects the ease of re-creating such experiences at a later time. 

An additional experimental example may further clarify the “strength of stored 
connections” versus “re-creation/differentiation” issue. Consider a list of 15 
sentences that all include different proper names in the subject and object 
positions (for example, John is talking with Sally; Bill is helping Sam; Betty is 
eating lunch with Ralph). Following a single acquisition trial, the subjects are 
provided with the subject nouns (for example, John, Betty) and asked to recall 
the sentence that went with each name. Informal pilot data suggest that the 
subjects do very poorly in this task. Now contrast this situation with one in 
which the acquisition sentences contain names of people with whom the subjects 
are well acquainted. For example, one might use the names (including last 
names) of professors and fellow students that all the subjects know. Under these 
latter conditions, our informal pilot data suggest that subjects’ performance 
improves markedly. Furthermore, if one can articulate a reason why two people 
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are interacting, the task seems even easier. For example, if X is Y’s student and 
one knows that they are working on a particular project together, then a 
statement like Y is talking with X seems to be more “richly” understood and 
easier to re-create. 

The mere “fact” that previous knowledge about the participants in events can 
affect comprehending and remembering is not really at issue. Instead, the issue 
seems to be the manner in which one characterizes such “past knowledge” 
effects. For example, does past knowledge allow one to form “stronger” or a 
“greater number” of connections between subject and object items? Or does 
past knowledge set the stage for the articulation of unique events that are 
therefore more differentiable from other knowledge? 

In this context, it is useful to consider Anderson and Bower’s (1973) state- 
ments about meaningfulness and connectionism. They note that an important 
characteristic of many mnemonic devices is that “. . . they make the information 
to be remembered more memorable by adding further information. This addi- 
tion of further information should mean more elements to associate together, so, 
on most associationist accounts, such mnemonics should make the task harder 
rather than easier [p. 57].” Later, they state: “The fact that the best remem- 
bered material appears to go against the spirit of associationism is an embarrass- 
ment to every existing associationist theory. However, being embarrassed is not 
the same as being proven false [p.57] .” 

We agree that HAM can seemingly overcome many of these “embarrass- 
ments”’—especially by making assumptions about linguistic and perceptual 
parsers. But we believe it unfruitful to separate problems of remembering from 
problems of comprehending and perceiving. In addition, by emphasizing dif- 
ferentiation as a crucial process underlying re-creation (i.e., remembering), one is 
not embarrassed upon finding that meaningful events facilitate remembering. In 
general, meaningful events are those that are more differentiated. Therefore one’s 
ability to re-create an event is not simply a function of the “meaning similarity” 
between a probe and a trace component (note that the probe “boy” is very 
similar to all “boy” traces). Furthermore, the ability to remember should not 
always be a simple linear function of the number of connections between a 
“stimulus” and “response” item, For example, a simile such as A whale is like a 
rabbit could permit a large number of connections (both breathe, eat, move, live, 
exist, etc.), whereas a simile such as A whale is like a skyscraper may permit only 
a single “connection.” Despite this, the latter simile may be more effectively 
cued recalled (given the subject noun) as a function of the degree to which a 
certain relationship permits one to differentiate among other items experienced 
on the acquisition list. 

One advantage of emphasizing a differentiation-te-creation view of remember- 
ing is that it stresses how one’s ability to remember a particular input will 
depend on the context of other items in which it was originally embedded. For 
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example, consider once again the two different lists of animal similes. In these 
lists, similes such as A whale is like a rabbit were much more difficult to 
remember than were similes such as A whale is like an elephant. Similes such as 
the latter presumably allowed subjects to articulate unique, well-differentiated 
relationships between both members of each pair. But it is important to note 
that certain well-differentiated resemblances (for example, between a whale and 
an elephant) are unique only in relation to a total set of acquisition experiences. 
The same argument holds for the unique relationship between whales and 
skyscrapers as well. It seems clear that “uniqueness” is a relational “property,” 
and cue effectiveness should therefore vary as a function of the total set of 
acquisition experiences. For example, we have presented the subjects with a set 
of similes like A whale is like a skyscraper, An elephant is like a redwood, A 
moose is like a giant, etc. Under these conditions subjects begin to have trouble 
accurately connecting each “stimulus” with each “response.” Similarly, we find 
that a sentence like A sparrow is like a rocket is accurately cued recalled under 
some list conditions. But if this sentence is embedded with others like A hawk is 
like an airplane, A robin is like a kite, etc., frequent mistakes and mispairings 
arise once again. 

The problems herein encountered are intuitively obvious. It is well known that 
“similarity” among items can cause “interference.” But the fact that such 
problems seem obvious should not detract from considerations of their impor- 
tance. In fact, any framework would be inadequate if it did not readily expect 
such “obvious facts.” Note that the present framework emphasizes that in- 
terference is not simply due to normative “stimulus” or “response” similar- 
ity. Instead, interference is due to the uniqueness of acquisition events. For 
example, a robin and a hawk are understood in a similar manner when em- 
bedded in similes such as A robin is like an airplane; A hawk is like a kite. 
But if one hears A robin is like a stoplight, A hawk is like a kite, ètc., we find 
that there is less confusion among the “stimulus” items and their respective 
“responses.” i | 

One way to view these problems of interference is to note that it is impossible 
to adequately predict remembering by considering only the local properties of 
individual “traces” of inputs. The global properties of a total set of experiences 
play an important role in determining subjects’ abilities to adequately re-create 
previous events. Indeed, even a simile like A whale is like a rabbit can be well 
remembered (for example, given the cue “whale”) if it is embedded ina set of 
totally unrelated (for example, nonanimal) similes. Remembering is a function 
of the total set of experiences to which an input belongs (see Garner, 1974). 


From the present perspective, the problem of re-creating an experience is there- 


fore a function of the degree to which probes help to differentiate among or 


uniquely specify one of a set of previous experiences. We assume a m 
degree of differentiation or focusing is a prerequisite for achieving sufficient 
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resolution, and that probe effectiveness therefore depends on the relationship 
between probes and the total set of acquisition experiences one has had.” 


Recognition as a Function of the Set to Which Items Belong 


Global aspects of the total set of acquisition experiences have powerful affects 
on recognition as well as cued recall performance. For example, assume that the 
subjects hear a sentence like The rock crushed the hut embedded in a set of 23 
additional acquisition sentences. It is obvious that subjects will be excellent at 
recognizing this sentence if it is semantically unrelated to the other acquisition 
sentences (e.g., see Peterson & McIntyre, 1973; Shepard, 1967). But if the same 
sentence is embedded among a set of semantically related sentences as in the 
Bransford and Franks experiments (e.g., Bransford & Franks, 1971, 1973), only 
a small percentage of subjects will think that they heard that particular sentence 
before. 

Of course, it is easy to dismiss the Bransford—Franks type of results as simply 
due to “confusion” or “interference.” But the subjects are not confused about 
the global information derived from the total set of acquisition experiences. They 
know the overall events that were communicated. They do, however, have 
trouble distinguishing between information that was actually presented and 
information consistent with the overall, invariant events that were described. 
This problem can be seen as analogous to one in which a blindfolded person is 
presented with a number of successive “touches” in order to identify an object. 
The structure that is ultimately apprehended will be specified by invariant 
information detected over time (e.g., see Gibson, 1966). If the person is later 
asked to distinguish the overall object from a set of additional objects, he will 
most likely perform accurately. But if the person’s hand is placed on the object 
in various ways, and he is asked to distinguish between his exact acquisition 
touches and other possible touches of the object, his performance may be 
relatively poor. 

This problem can also be viewed from a slightly different perspective. At 
Vanderbilt, Normand Benjamin is currently pursuing a project that involves the 
perception of a single object (that is, a black circle) moving across a stationary 
background grid. One group of subjects sees a series of slides that depict discrete 
snapshots of the circle tracing a continuous trajectory (e.g., like a falling ball 


"Note that lists of highly similar items (all animals, all proper names, etc.) frequently 
result in many “errors” in remembering. It might be argued that such lists are unrepresenta- 
tive, and overemphasize “errors.” However, consider the fact that most classroom situations 
involve attempts to teach general ideas and concepts that are initially undifferentiated from 
one another. Students can frequently grasp abstract, global characteristics of the to-be- 
learned material, yet be unable to articulate precise examples and arguments that they 
experienced, This suggests that memory studies that use undifferentiated lists may be more 
representative of real-world situations than are studies that utilize initially highly differen- 
tiated lists. 
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that hits something and bounces back up). Following this acquisition sequence, a 
perceiver is aware of this continuous trajectory. Intuitively, one is more con- 
fident about the overall trajectory than about the discrete snapshots of the ball 
that were seen. In fact, preliminary results indicate that the subjects have 
difficulty differentiating between OLD and NEW static snapshots of the ball 
located on the acquired trajectory. But any snapshot that shows the ball at a 
position removed from the overall trajectory is easily recognized as something 
different from the acquisition set. Benjamin’s study also includes a second group 
of subjects who see the same slides during acquisition, but the slides are 
randomized so that a continuous trajectory does not emerge. Preliminary results 
indicate that these subjects are less likely than Group 1 subjects to falsely 
recognize NEW snapshots if the latter were snapshots depicting the ball on the 
trajectory acquired by subjects in Group 1. Benjamin’s results suggest that the 
invariant information detected as a function of the total set of acquisition 
experiences constitutes the most immediate and salient aspect of “what was 
learned.” 

Of course, the previously noted Bransford-Franks experiments might not be 
analogs to the “touching” example or to Benjamin’s experiments. Other ac- 
counts of the Bransford-Franks experiments have been proposed. For example, 
Anderson and Bower (1973) assume that subjects count the number of “momen- 
tary context tags” associated with each proposition in a test sentence (e.g., see 
Anderson & Bower, 1973, pp. 347-352). The greater the number of tags 
associated with the propositions in each test sentence, the greater the probability 
that the subjects will say that they heard that sentence before. But assume that 
one presents subjects with a 24-sentence acquisition list structured like those in 
the Bransford-Franks experiments and includes one additional, single proposi- 
tion sentence that is semantically unrelated to the other sentences. The subjects 
will be excellent at recognizing this sentence despite the fact that there is only 
one “momentary context tag” to count.’ According to Anderson and Bower 
(1973) recognition performance given unrelated sentences involves the use of 
different decision strategies (e.g., see Anderson & Bower, 1973, p. 348). But ina 
mixed experiment, what tells the subjects which strategy to use? That is, in 
order to use different decision strategies, one must differentiate between related 
and unrelated sentences. Hypotheses that different strategies are used for related 
and unrelated items merely begs the question of how the initial differentiation 
occurs. 

The problem of “specific remembering” of unique events can also be viewed 
from a different perspective. In general, subjects in the Bransford-Franks para- 
digm have difficulty distinguishing between OLD and NEW sentences. Yet there 


® We ran this experiment just to be safe. All subjects recognized the ONE The cat was seared 
when it was unrelated to other acquisition sentences. Less than 40% of the subjects recognized 
the same sentence when it was included with five additional sentences (some ONES, TWOS, 
and THREES) about a cat (as in the Bransford—Franks paradigm). 


458 BRANSFORD, McCARRELL, FRANKS, AND NITSCH 


is some ability to differentiate among these items; especially at the level of 
ONES (e.g., see Anderson & Bower, 1973, pp. 350-351; Bransford & Franks, 
1971). However, a reanalysis of the OLD-NEW data from Bransford and Franks 
(1971, Experiment III) indicates that only certain ONES contribute to this 
difference. The reanalysis was motived by Singer and Rosenberg’s (1973) find- 
ings that the subjects’ “yes” responses to test sentences were strongly affected 
by the presence or absence of the main clause of the overall events (and see 
Franks & Bransford, 1974). 

The ONE sentences from Bransford and Franks (1971, Experiment TIT) were 
partitioned into those that did versus did not contain the main clause of the 
events acquired during acquisition. For example, the ONE The rock crushed the 
hut was assumed to express the main clause of the FOUR The rock which rolled 
down the mountain crushed the tiny hut at the edge of the woods. Results 
indicated that there was no difference between OLD and NEW ONES that 
contained the main clause (for example, percent yes ratings were 37.3% and 
34.6% for OLDS and NEWS, respectively). There was, however, a marked 
difference between ratings for OLD versus NEW ONES that did not contain the 
main clause (that is, percent yes ratings were 36.4% and 17.7% for OLDS and 
NEWS, respectively). 

Consider these results in connection with the fact that a completely unique 
sentence embedded in a Bransford—Franks paradigm is extremely well recog- 
nized. If the subjects are listening to a set of semantically related acquisition 
sentences and hear a ONE like The hut was tiny, at least some of them may 
perceive it as unrelated to any previous experiences. Lack of information about 
the main clause may prompt the subjects to perceive such ONE’s as unique. 
During recognition, these subjects should therefore recognize such paticular 
utterances, hence there should be overall recognition differences depending on 
whether such ONES were OLD or NEW. On the other hand, ONES with main 
clauses should be less likely to be perceived as unique during acquisition. There- 
fore, ONES with main clauses would be expected to receive equivalent ratings 
irrespective of whether they were OLD versus NEW. Considerations such as these 
may help clarify why researchers such as Reitman and Bower (1973) found such 
high degrees of specific “memory.” They presented subjects with inputs such as 
12, 14, 134, 1, etc., and assumed that subjects would use these to form an 
“overall idea” such as 1234. But why should subjects perceive inputs such as 
twelve, fourteen, etc. as subparts of an “idea” like 1234? 

From the present perspective, the important issue is not whether people are 
always accurate versus inaccurate in their abilities to remember. Instead, the 
issue involves the conditions under which subjects can re-create previous experi- 
ences they have had. One can, of course, concentrate on accuracies and assume 
that inaccuracies are due to “errors,” “confusion,” “interference,” etc. On the 
other hand, one can focus on the acquisition of global, invariant information as a 
function of a total set of acquisition experiences. In Benjamin’s study with the 


16. TOWARD UNEXPLAINING MEMORY 459 


moving dot, for example, it seems clear that the continuous trajectory (for 
Group 1 subjects) is the most salient aspect of what was learned. The information 
derived from the total set of acquisition experiences sets constraints on one’s 
abilities to re-create previous acquisition experiences (i.e., remember). In short, it 
appears that what is learned cannot simply be equated with a list of previous 
memories. In fact, it would appear to be the global nature of what was learned 
that sets constraints on one’s abilities to re-create (i.e., remember) those inputs 
experienced during the acquisition task.? 


Further Constraints on Recognition 


An emphasis on the total set of acquisition events also highlights another 
problem: Given a recognition test, how do subjects know that they have not 
experienced a particular item during acquisition? Do they have to search through 
all their stored traces in order to realize that a foil does not “match” any of 
these traces? Do subjects know “where to look” for a foil and then check to see 
if it has an appropriate context tag? : 

There are conditions under which subjects do appear to “search the acquisition 
set” in order to reject items (e.g., Sternberg, 1966). In these studies, the larger 
the size of the acquisition set the greater time it takes to say “no” (as well as 
“yes”). But even under these very short-term memory conditions, decisions are 
affected by the global structure of the set of acquisition items. For example, 
Morin, De Rosa, and Stultz (1967) presented the subjects with acquisition sets 
like 3456. They found that the subjects’ responses were a function of the 
“distance” between probes and the list boundary. For example, subjects were 
faster at rejecting 9 than 7. > 

Results such as those of Morin ef al. (1967) are not gracefully incorporated 
into “exhaustive search” models nor “look for a context tag” models. Yet it is 
easy to believe that such results occur only under special circumstances and with 
special kinds of materials. Perhaps they represent the exception rather than the 
rule. But assume that subjects hear a list of nonsense syllables and We a 
presented with foils—one of which is a picture of a red triangle. Such oe N 
be so readily rejected that it’s presentation would seem ae eae 
subjects’ (and experimenter’s) point of view. Or assume that the subjects ve 
list of words that all have some invariant semantic structure (for an ps 
examples of fruits), and are later presented with e like truck or rock. 
again, the subjects should be able to reject these with ease. 

Of course, the preceding examples are obvious. And there are N, 
to conceptualize such results. For example, one could assume tha j 


when one 
°Problems of re-creating previous experiences become even E (sayasa 
attempts to remember things like “how I used to think about a partic s to play an impor- 
graduate student) etc. In these situations, one’s present pet 
tant role in determining the nature of the “memories” he re-creates. 
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“store” a proposition like “all inputs are nonsense syllables,” “all inputs are 
fruits,” etc. One could then readily reject foils that did not “fit” the constraints 
that the proposition expressed. But would this approach illuminate the problem 
or bury it? For example, how does one know that a foil is not a nonsense 
syllable, or not a fruit? Does one first find this type of information in “semantic 
memory” and then use it to “cut off” a search of “episodic memory”? If so, 
then how does semantic memory operate? For example, if one is asked “Do you 
know the meaning of salakan?,” does he have to search the entire contents of his 
memory in order to arrive at a negative answer? Similarly, does one have to 
search all possible information in order to reject a statement like Some chairs are 
people? Such decisions are made very quickly (Meyer, 1970); and the further 
apart the conceptual boundaries, the easier the negative decisions are to make 
(e.g., Moyer, 1973). 

It seems clear that the “rejection” problem occurs in “semantic” as well as 
“episodic” memory. To ignore it in accounts of “memory” may be to ignore one 
of the most important sources of constraint on adequate theories of remember- 
ing. From the present perspective, work on this problem may help illuminate 
how resonance operates as a function of one’s level of attunement, and how 
one’s current level of attunement can be changed and focused in response to a 
particular probe. With respect to “episodic memory,” the subjects seem to reject 
items which afford no re-creation of a previous experience. Some of our investi- 
gations into the problem reveal statements such as “I’m positive I didn’t experi- 
ence that during acquisition because I know that I would have especially 
noted it if I did.” The problem of rejecting items may therefore be intimately 
related to the problem of why we are not constantly deluged by the re-creation 
of previous, personal experiences with familiar inputs. Problems of resonance are 
inextricably bound to characterizations of one’s present level of attunement. We 
know very little about these attunement processes, but it seems reasonable to 
assume that in a typical memory experiment one’s level of attunement will be a 
function of abstract characteristics of the total set of acquisition experiences. 
The ease with which subjects can reject items should therefore be a function of a 
foil’s relationship to the set of acquisition experiences as a whole. 


SUMMARY AND CONCLUSION 


We have questioned some of the assumptions underlying the prevailing memory 
metaphor. Spatial components of the metaphor include assumptions such as 
“traces are stored in certain locations” and “certain traces are stored together in 
episodic memory” (for example, as in hierarchical organization). Similar assump- 
tions are frequently made about “semantic memory;” for example, that “ideas” 
are stored at greater or lesser distances from one another and that this affects the 
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time to get from one place to another, the speed of “spreading activation” (cf. 
Meyer, Schvaneveldt, & Ruddy, 1972), etc. These spatial assumptions strongly 
affect conceptualizations of the nature of search and retrieval processes, for 
example, that one must know where to look to find a stored trace (plus its 
appropriate context tag). 

The metaphorical status of these assumptions is not the problematic issue. 
Indeed, some theorists (e.g., Hesse, 1966) have argued that all explanations are of 
the nature of metaphoric redescriptions. The issue, then, concerns the fruitful- 
ness of one’s metaphors. The issue concerns the tradeoff between the memory 
metaphor’s ability to help conceptualize certain phenomena and its tendency to 
foster neglect of other problems that might otherwise stimulate theoretical 
growth in a number of areas. 

Consider the assumption that certain “ideas” are stored at certain “distances” 
from one another in “semantic memory.” Such conceptualizations convey the 
impression of some static form of storage invariably accessed when making 
decisions, For example, the “distance metaphor” might capture the notion that 
given a triad like shirt-shoe-rock, shirt and shoe seem to be “closer together” than 
rock and shoe. But assume that one is looking for objects that could be used to 
pound a nail into the wall. Now shoe and rock seem to be “closer together” than 
shoe and shirt. In short, static conceptualizations of semantic memory fail to 
orient one toward the flexibility of understood relationships as a function of the 
context or setting in which events occur (e.g., see Anderson & Ortony, 1976; 
Barclay, Bransford, Franks, McCarrell, & Nitsch, 1974; Barclay & Jahn, in press; 
Bransford, McCarrell, & Nitsch, 1976). 

This chapter has also questioned the fruitfulness of assuming the episodic 
events are “stored together.” Assumptions about spatially organized structures 
may “explain” some aspects of free recall performance (for example, clustering). 
Yet it was also argued that the explicit presentation of retrieval probes allows 
the subjects to “bypass” such hypothesized structures. That is, if the subjects 
had to “search through” such structures in order to “find” traces containing the 
information specified by a retrieval probe (for example, a “trace” containing 
the word violin) then it is not clear why they could not simply have searched and 
found that trace under nonprobe conditions. We therefore suggested that phe- 
nomena such as clustering in free recall may reflect more about the nature of the 
cues implicitly accessible to the subjects than the “spatially organized storage 
of the inputs per se. 

The preceding arguments make certain assumptions about the pr > 
which probes help one remember previous experiences. If the presentation of a 
probe simply tells one “where to look” to “find” a certain item then the spatial 
components of the memory metaphor are still reasonable. But how could a 
probe tell one “where to look”? If probes somehow “specify the addresses of 
certain stored items, then why is the effectiveness of probes limited? For 


ocesses by 
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example, why are some components of an event more effective than others for 
“retrieving” the total event (e.g., see Peterson, 1974). Or why does the same cue 
for :28 different sentences fail to facilitate recall? Because of a limit on the 
number of “addresses” that can be attached to a particular item (for example, 
“boy”)? Because of a cut-off on search? In our opinion, it is unfruitful to assume 
that probes tell one “where to look” for previously stored traces. It may be as 
unfruitful to assume that cues retrieve “idea traces” as it is to assume that a 
sunny day acts as a cue to help one retrieve “happiness traces” or “spring fever 
traces.” It seems more fruitful to ask how cues help one re-create previous 
experiences. 

Following Bartlett (1932), we suggested that remembering involves processes 
similar to thinking. Both processes frequently begin with an abstract “attitude” 
(cf. Bartlett, 1932) or “level of attunement” that needs to be further articulated 
or focused. The “retrieval” problem underlying remembering was assumed to be 
analogous to a process of helping one further focus or articulate his thoughts. 
The most powerful variable affecting cued recall as well as recognition seems to 
be the relationship between the probe and the total set of acquisition experi- 
ences, A probe must allow one to further focus his thought processes by 
permitting a sufficient degree of differentiation among previous experiences. 
And the degree of differentiation is relative to the subjects current level of 
“attunement” resulting from the set of acquisition experiences as a whole. 

Of course, problems of differentiation are one of the most widely known 
“facts” in psychology. Everyone realizes the importance of constructing experi- 
mental lists so as to avoid unnecessarily “confusing” the subjects. And it hardly 
seems surprising that given a “confusable” list, the subjects do make “mistakes.” 
But what do these obvious facts indicate about the nature of remembering? In 
particular, what do they indicate about the notion that an input results in a 
relatively independent, stored trace? 

Assume that the subjects are presented with discrete, unrelated inputs (e.g., 
Shepard, 1967). Under these conditions, the subjects look like “exact copy” 
mechanisms that simply store traces. Yet one can radically change the prob- 
ability that particular inputs are remembered by manipulating the acquisition set 
in which they are embedded. For example, Bransford, Franks, and Nitsch (in 
preparation) find that the same schematic face may be either poorly or well 
recognized depending on its relationship to the other faces shown during 
acquisition, And when the subjects are provided with inputs that allow the 
specification of certain wholistic, invariant structures (e.g., Benjamin’s on going 
work; Bransford & Franks, 1971), their “memories” are more related to these 
wholistic structures than to “copies” of the exact inputs per se. Similar variables 
appear to affect the “strength of connections” between traces or trace compo- 
nents. For example, a simile like A sparrow is like a rocket appears to result in 
the formation of a “strong” sparrow-rocket “connection” when it occurs in 
some acquisition contexts. Yet in other acquisition contexts, the “connection” 
appears to be weak. 
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The preceding arguments illustrate some of the challenges of contextualism 
(see Jenkins, this volume). From this perspective, “context” is not simply a 
variable to be manipulated. Instead, contextualism suggests an orientation 
toward problems that questions many prevailing assumptions; for example, the 
assumption that the problem of remembering is equivalent to the problem of 
“finding” discrete traces and “following the connections” from one element of a 
trace to the next. And a contextualist orientation suggests methodological 
caveats. For example, remembering is assumed to be a function of the total set 
of acquisition experiences (that is, of the global context in which inputs are 
embedded). Memory performance is therefore not simply a function of local 
properties of individual traces, but is rather a function of the global character- 
istics of the set of acquisition experiences as a whole. It follows that certain 
variables may or may not be important depending on the acquisition context. 
For example, it seems clear that one could set up conditions in which subject 
nouns are better retrieval cues than object nouns. Yet one can also do the 
reverse, If one’s theoretical purpose is to make general statements about particu- 
lar populations of entities (for example, about the retrieval effectiveness of 
subject nouns versus object nouns; about recognition accuracies for four proposi- 
tion versus one proposition acquisition sentences) then he can quickly run into 
difficulty. The effectiveness of such variables will be a function of the relation- 
ship between inputs and the acquisition set as a whole. 

We close by noting that a basic reason for interest in the problem of “mem- 
ory” involves a concern with the processes by which past experiences affect 
one’s subsequent interactions with the environment. For example, it is fre- 
quently assumed that such interactions depend on the nature of one’s knowledge 
stored in “long-term memory.” From this perspective, one’s actions and inter- 
pretations are therefore assumed to be mediated by “contact” with previous 
“memories” that were stored. Yet we have suggested that “what is learned” 
cannot be reduced to the problem of how one forms and stores “memories.” In 
fact, we have argued that learning involves the detection of invariant information 
from a set of acquisition experiences, and that one’s ability to accurately 
remember depends on “what was learned.” If the problem of remembering is 
assumed to depend on the organism’s state of attunement following acquisition, 
and if learning is viewed as the problem of how organisms become attuned to 
perceive situations, then Gibson’s (see Section 1) claim that perception does not 
depend on memory (that is, on contact with previously stored memories) 


becomes an extremely important idea to pursue. 
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Ecological approach (contd.) 
compared with information processing, 
5-7, 10 
compared with psychophysical approach, 
47, 63 
empirical objectivity and phenomenologi- 
cal reference, 110 
Ecological gustation, 52 
Ecological haptics, 52, 54, 63 
Ecological information 
for actions, 235, 246, 250-253 
for age, 124-129 
attensity, 55 
conflicts and contradictions, 58-59 
contained in light, 50 
definition of for vision, 50 
dependence of direct perception on, 57 
detection of, 49-50 
equivalence of pick up and experience, 111 
global nature of, 129 
to indicate structure, 50 
ys. information in Shannon and Weaver 
sense, 50 
information space, 107 
for locomotion, 50, 246-249, 251-259 
as meaningful, 285, 305, see also 
Affordances 
for motion and shape, 51 
perceptual information vs. physical 
measurements, 104 
in pictures, 62 
specification of affordances by, 111 
for speech and visual form, 282-289 
utility of, 59 
variables of stimulation vs. stimulus 
information, 50, 57 
Ecological olfaction, 52 
Ecological optics, 54, 73-74, 79-82, 285 
goal of, 53 
research in, 54-55 
Ecological psychology, 200, 206, see also 
Ecological approach 
Education of attention, 80, 107, 180 
and differentiation, 49 
and perception, 433 
vs. enrichment, 434-435 
Effector systems, 277 
Effects of art, definition of, 136-137 
depth, 137 
four major types of, 137-138 
hierarchy of, 139 
vs. suggestions and evocations, 136-137 
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Efference, 211, 248 
alpha-efferent system, 273 
and vision, 286-287 
Efficient cause, 171, see also Aristotle’s four 
causes 
Egocentrism, 186 
Elastic, component of growth, 116-117, 
124, see also Ecological information, 
viscal and age 
Elastic, defined, 123-124 
Elastic events, 129 
Empiricism, 15, 22 
cryptoneoempiricism, 15 
radical, 15,22 
Entrainment, 307 
definition of, 306 
Environment, 4, 43,45, 48, 173 
affordances of, 68-70, 72, 77 
for animal locomotion, 246, 248 
and computation, 194, 199-200, 206 
consonantal, for perception of vowels, 
331-335 
contingencies of, 245 
environmental structure, 53-54 
events in, 245, 373, 381 
functional, 6 
importance to Gibson’s theory, 53-56 
invariants of, 250-259, 279 
material, 6 
media of, 70 
natural vs. artificial, 70-71 
noisy, 317 
physical, 23 
projected as action plans, 229-230, 235, 
251-259, 274 
relationships within, 306 
stable points of, 48 
substances of, 70 
for tuning action system, 252-259 
variants of, 250-259 
Epistemic mediation in perception, 20-21, 
23, 29, 31-34, 36, 38 
Equilibration, hypothesis for cognitive 
development, 173-174, 178 
of joint-muscle system, 240-244 
Events 
and abstract ideas, 421 
elastic, 129 
and experience, 415-416, 425-426 
experiments concerning recognition of, 
421-428 


Events (contd.) 
importance to experimental psychology, 
427 
information in light for, 50 
as interation of subject with object of 
attention, 420 
global and local, 129, 426 
perception of, 124-130, 248-259, 
373-376, 385, 417 
and problem solving, 424 
in recall, 418,420 
recognition of, 417 
and remembering, 454 
as result of perception, 56 
sensitivity of organism to, 48 
specification of, 50 
specifying affordances, 110, 420 
storage hypothesis, 461 
structure of environmental, 52 
as structured over time, 56 
unity of spatiotemporally separated, 423 
viscal, 129 
Evolution, 10, 32,55, 219 
evolutionary attunement, 16-17, 61, 107 
of behavior, 246 
incompatibility with indirect theories of 
perception, 63 
of perceptual systems, 107 
Evolutionism, 15, 22, 69 
Experience, and consciousness, 282, 
286-287, 292-294 
constitutive vs. preparatory, 33-36 
ecological approach to, 111 
and events, 415-416, 418, 425-427 
everyday, 268 
of locomotion, 283 
sensory, 279, 289-290, 305-306 
voluntary, 293 
Executive system for action 
as coalitional, 252-259 
in coordinated activity, 215, 217-218, 
220-227, 233, 240 
procedures of, 253 
specification of action plan through, 245, 
250 
Exploration of ambient array 
vis à vis legs, 61 
vs. response to, 51 
Exploratory behavior, 
in computation, 201 
in development, 180-187 


Exploratory behavior (contd.) 
relation to perception, 250-259 
visual guidance for, 246-248 

Eye, 
cultural, 160-161 
disposition of, in art, 138 
innocent, 164 
movements in sleep, 289 
natural, 160-161, 164 
nature of, in art, 138, 163 
order in which it reads paintings, 139 

Eyeside, in art, 146-147 
aspects of attention, 145-146 
introduction of term, 141 
modes of attention in abstract art, 

146-147 
peculiarities of, 148-150, 169 


F 


Feedback, 193, 240 
in joint muscle adjustment, 240-244, 246 
internal to nervous adjustment, 240-244, 
246 
Feedforward, coalitional, in guided 
locomotion, 246-248, 253 
Figurative knowing hypothesis, 171-17 3, 
177 
Figure-ground, information for, 52 
Force, in motor functioning, 216-217, 
229-230 
Form perception, in art, 139 
Fourier analysis, 85, 91 
Fovea, 85 
Function(s) 
biological, 171 
cerebral, 211 
as concepts, 215 
effective, 197-198, see also Algorithm 
functional activity, 184-185 
intellectual, 186 
mathematical, 204-205, 212, 226, 
241-244 
motor, 212, 216-217, 223, 226-227, 
240-245 
of referent, in meaning, 361 
of tuning in action, 249-250 
Functionalism, 1 
Functional units, see also Coordinative 
structures 
of reflexes, 220 


SUBJECTINDEX 481 


G 


Ganzfeld, 45 
Generator, 220 
for concept formation, 277 
for pattern recognition, 393-394 
Geometric approach to perceptual theory, 
110-112 
as merging of psychology and epistemol- 
ogy, 112 
Geometries 
definitions of, 106, 112 
of growth space, 112 
hierarchy of, 107, 113 
symmetry operations in, 112-116 
transformations in, defined, 119-120 
Geometry, see also Topology 
Affine, defined, 114 
group of transformations, 114-115 
shape in, 130 
shear in, 124 
strain in, 121-122 
Metric-Euclidean, defined, 112-113 
group of transformation, 114 
in shape perception, 103-105, 114, 130 
Projective, defined, 115 
as nonmetric, 130 
strain in, 122-123 
Riemann’s approach, 118-120, 123, 125 
Gestalt psychology, 1,49, 63-64, 77, 107 
laws of, 137, 158, 162, 165 
Givens as momentary slices in time, 57 
Grammatical form 
deviant, 367, 369, 371 
normal, 367 
not related to resemblance, 380 
Ground theory of space perception, 46, 72 
Group concept, see also entries under 
Invariants; Symmetries; Transformations 
mathematical, 259, 279 


H 


Habitat, 6, 69 

Hallucination, 287-289 

Haptic perception of shape, 52 

Heterarchical organization, of action 
systems, 221-225, 240, 245 

Heterarchies, in action, 221-225, 240-245, 
251-259 

defined, 221 
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Hierarchical organization, 303 
in acoustical speech patterns, 315 
in semantic memory, 395, 398, 400 
Hierarchies, 3, 8, 221-225 
among memory traces, 460 
in consciousness, 293 
of geometries, 107, 113 
Hull’s habit-family, 414 
of theories, 204-205 
Higher order variables, in direct perception, 
44-46, 54, 60, 68,71 
Homeomorphism, see also Topology and 
Transformations 
defined, 117 
in growth transformations, 119 
in recognition of shape growth in space, 
118 
and simple closed curves, 123 
Homunculus hypothesis, 7-10 
and retinal image, 62 
ultimate neuron, 8 
ultimate perceiver, 62 
Höffding step, 283-285 
Holograms, model for memory, 83-85, 89, 
99, 298-299 
Holonomic functioning, 86-89, 96 
Homeostasis, 173-174 
Homeorhesis, 174 
Horizon, information specifying, 50, 60-61 
specification of size of objects seen against 
it, 55 


Iconic store hypothesis, 7 
Identity, perception of individual, 126-128 
Illusions, 43, 89, 93-94 
as decontextualized illustrations, 138 
examples of geometric, 104 
explanation of, 106-107 
lack of causal explanation, 168 
need for causal explanation, 152 
of space in pictures, 147 
susceptibility of chicks to, 107 
Image(s) 
animation by memory, 160 
direct, of space, 56 
formation, 47, 97 
in geometric mappings, 120 
and hallucinations, 287-289 
motor theory of, 288 


Image(s) (contd.) 
neurophysiological, hypothesis of, 83, 89, 
90,92 
as not the direct objects of perception, 63 
of paintings, 142 
storage by, 84,270 
Immanent formism 
attack on by Michotte and Gibson, 375 
definition of, 366 
and science, 386 
Indirect perception, 11, 23-30, 55-63, see 
also Constructivism 
and retinal image theories, 62 
Information 
about articulation, 319 
in communication theory, 6 
global invariant information about 
experience, 458 
haptic, 30, 80 
integration of in event recognition, 417, 
425-426 
perceptual, 88, 94,98, 175, 182, 185 
for age, 124-129 
for age, abstractness of, 127, 129-130 
for age, global, 129 
detection into components, 49 
ecological, 6, 127, 129, 235, 246, 
250-253 
hypothesis of transformation, 297 
for locomotion, 50, 246-249, 251-259 
modulation of in memory, 279 
for painting, 136 
pick up of, 253, 376 
for shape, 119, 129-131 
in transforming array, 50 
in semantic memory, 399 
in sentences as observor-dependent, 375 
signal to noise in speech and redundancy, 
316n. 
for speech normalization, 329 
in speech signal, 316 
subjective and objective poles of 
specification, 52 
in syllables, 326, 329 
for vowels, 322, 331 
Information-processing hypothesis, 1-2, 
5-7, 10, 44—45, 85-87, 97 
in motivation of action, 295 
in motor theory, 235, 238, 304 
Neisser’s concept of, 269 
Weimer’s analysis of, 283-285 


Inhibition 
in nervous functioning, 234, 250 
reciprocal, 234, 238 
Invariances between innervational impulses 
and evoked movements, 216 
environmental, 92, 98 
input, 92 
lack of in vowel production, 324, 338 
perception of, 172, 181, 185, 187, 249 
for phonemes, 319 
in species-specific behavior, 246 
in vocal tract shape, 340 
Invariant(s) 
in acoustic array, 319, 336, 339n. 
in affordance theory, 75, 80 
as compound-higher order, 80, 124-126, 
128-131, 252-253, 256-257 
defined over change, 56 
detection of, 129-130, 242, 245, 
250-251 
ecologically significant, 107 
fundamental environmental, 107 
information, 21, 49, 246 
relations, 279 
as stimulus, 49 
specifying events, 374 
specifying identity, 126-128, 256 
structural, see also Symmetry 
in aging of faces, 124, 126-128 
as ecological information, 113 
in geometry, 112-113 
in projection (ratio of similitude), 
114-112 
timeless and formless, 51 
transformational, see also Symmetry 
operation 
as ecological information, 113 
in geometry, 113 
and variant information, 49, 51, 249-259 
Isomorphism, 229 
of motor function descriptions, 258 


J 


Joints, skeletal, 213, 229, 233 
and muscle system, 240-245 


K 


Knowledge by aquaintance, 290, 378 
as consciousness, 292 
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Knowledge by aquaintance (contd.) 
and mind-body problem, 294 
in science, 382 


= 


Layout 
information for, 50 
of surfaces, 67, 70, 72-74, 76 
in visually controlled locomotion, 49, 
246-248 
Learning, see also Differentiation; 
Education of attention and Perceptual 
learning 
Behaviorist theory of, 278 
in cognitive theory, 175-176, 179, 181, 
205 
contractual theory of, 297-298, 304 
Gibson’s concept of, 433-434 
perceptual, 49, 171, 453 
Light 
frequency, 44 
intensity, 44 
Limbic system, role in memory, 302 
Linguistic forms, see Metaphor; Model; 
Simile; Symbol 
Localization of brain functions, Lashley’s 
refutation, 291 
Locomotion, 48, 77, 79, 80 
animate, 76 
through the environment, 283 
visually guided, 43, 49, 58, 246-259 
Logical positivism and Aristotelian view of 
language, 366 
Looming, 61, 73 


M 


Man-machine analogy, 2-5, see also 
Mechanistic theories 
Meaning(s) 
and affordances, 4, 44, 67, 77-82, 375 
and analysis of speech, 416 
and concept formation, 296-299, 347, see 
also Diachronous 
of events, 415-416, see also Affordances 
as independent of observor, 368 
and interpretation, 386-387 
linked 
characteristics of, 350 
compared to family resemblances, 359, 
414 
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Meaning(s) 
linked (contd.) 
contexts and, 352 
overlapping, 359 
synchronic, 352 
see also Chain complexes 
literal vs. figurative, 370 
of metaphors, 350, 372 
motor theory of, 296, 298, 299 
and pattern recognition, 296 
in perception, 284 
perception of, 372, see also Affordances 
perspectivist view of, 366 
preservation of in ecological information, 
118 
and problem of its production, 285 
as psychological relation, 372 
as reference alone, 366 
representation of, 296-299 
of sentences, 367, 371, 378, 382-383, 
420-424 
storage of, 437-438 
storage of elements of, 436 
surplus of, 386 
units of, 437 
of words 
aquisition of, 352 
as associational units, 414-415 
change of, see diachronic 
clustering of, 397-398 
complex interrelations in, 349, 359,416 
as constant, 348, see also Synchronic 
organization in linguistic meaning 
and context, 359, see also Contextu- 
alism 
disorganization in, 348, 360 
in events, 416 
problems of, 417 
and reference, 350-351, 361-362 
in relation to other aspects of words, 
417-418 
and semantic retrieval, 394, 418 
shifts in, 350, 352 
similarity in, 397 
Means-end analysis, 146 
Mechanism(s) 
of action, 272 
of coordination, 219, 224, 230, 240, 
246-248, 251-259 
of motivation, 295 
of perception, 44, see also Perceptual 
system 


Mechanism(s) (contd.) 
of reciprocal inhibition, 238 
of semantic retrieval, 394 
Mechanistic decomposition, 194, 196 
Mechanistic explanation, 414, see also 
Mechanistic theories 
Mechanistic models, 191, 198-199, 
206-207 
echanistic problems for speech perception, 
316-317, 323-324 
Mechanistic procedures, 192, 199-203, 207 
Mediated perception, 46, 55, see also 
Indirect perception 
Metaphor(s), comprehension of, 371, 
384-387 
extraction of ground of, 373 
in language use, 365, 367 
and meaning linkages, 380 
and metaphorical relations, 368 
neglect of by psychology, 366 
in science, 382, 386 
and the understanding, 387 
Memory 
associative, see Associative meta-theory of 
memory 
Bransford’s theory of, 279-280, 372 
comprehension in, 417-426 
concepts, 280 
contextualist meta-theory, 416-426 
Ebbinghaus’ concept of, 431-432 
ecologically valid problems for research, 
427-428 
enrichments, 434 
events in, 418 
Gibson’s concept of, 433 
holographic nonstorage approach, 
298-299 
holographic storage approach, 86-87 
James’ theory of, 418, 427 
Jenkins’ approach to, 415 
long term, 2, 270, 441 
mediation theory in, 413 
mnemonic devices for, 454 
organization in, 418 
and perception, 215, 285, 427 
and prompted recall, 372, 382-383 
reduction of experience to, 435 
retrieval cues, 442-443 
scanning hypothesis, problems with, 395 
semantic, 393 
semantic, retrieval of, 393-399, 407 
semantic vs. episodic, 460-461 


Memory (contd.) 
of sentences, 420-427 
short term, 205, 270 
storage hypothesis, 2, 4-5, 271, 298, 395, 
400-401 
Memory metaphor, 
criticism of, 433-434 
definition of, 432 
and exhaustive search, 435 
and novel information, 434 
and remembering, 439441 
search aspects, 444-449 
stored structures hypothesis, 446-448 
coding of, 439 
hierarchical organization, 460 
and input, 436 
and interpretation of events, 432 
local and global properties, 455-456 
and perception, 433 
storage and search of, 432 
Mind-body problem, 13, 62 
and commissurotomized brains, 293-294 
identity solution of, 8-9 
and knowledge by aquaintance, 294 
in perception, 108-109, 111 
Model(s) 
of action concepts, 215-220 
constructivist, 2,9, 45, 212, 268 
of semantic memory, 393-399 
dictionary-network model of semantic 
memory, 400-407, 409 
constructivist, hierarchical network for 
semantic memory, 394-397 
in language, 365 
motor, of meaning, 296 
of muscle-joint system, 240-244 
stimulus-reinforcement, 303 
Morphogenetic field(s), 96 
and growth space, 124-129 
Motion 
rectilinear, 48 
rigid and nonrigid, 55 
Motion parallax, 47-48 
Motion perspective, 47-48 
Motivation 
defined, 294 
problem of, 5 
Motor 
activity function, 212, 215, 240-244, 
246, 268, 274-275, 286 
concept of mind, 268-270, 274-275, 
299-300 
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Motor (contd.) 

cortex, 190, 225, 229, 273 

functioning, theories of, 212 

theory of imagery, 288 

Plato’s theory, 301-302 

rejection of distinction between motor 
and sensory, 61, 211-212, 259 

skills, 303-304 

theory of imagery, 258 

theory of meaning, 276, 296-299 

theory of memory, 280 

theory of mind, 291-292, 294 

theory of perception, 282-289, 302 

theory of speech, 282-283 

Muscles, 211, 213, 216-217, 220, 228-229, 
232, 236, see also Action; Coordinative 
structures; Reflexes 

agonist groupings of, 233, 235, 237-238, 
244 

antagonist groupings of, 230, 233-234, 
237, 244 

contraction of, 231 

flexion-extension of, 218 

innervation in, 221-225, 232 

and joint system, 240-244, 246 

receptors, 273 

reciprocal facilitation, 229 

reciprocal innervation, 21 6-220, 225, 231 

tuning, 218 


N 


Nativism, 13, 15,22 
ys. empiricism, 15, 22 
radical, 22 
Natural selection, 17, 32 
Nerve impulses 
central, 215 
continuous (dendritic), 92 
discrete (action potentials), 84 
and efficacy of mental, 306 
firing patterns, 8-9 
in perceptual systems, 109 
Nervous system, 83-99 
in action, 211-220 
in locomotion, as coalitional, 246-248, 
251-259 
Networks, 3 
Neurons 
afferent, 231 
alphamotoneurons, 230, 232-233, 237, 
243-245, 273 
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Neurons (contd.) 
gammamotoneurons, 232, 243-245, 273 
interneurons, 231, 233, 235 
Neurophysiological mechanisms 
for action, 212-220, 246-248, 250-259 
alpha-gamma links, 243 
gamma subsystem, 232, 273 
of memory, 83-87, 298-299 
two process hypothesis, 84 
Niche 
ecological, 6, 69, 70 
reality of, 69 
Nominalism, 11 
Normalcy, 136, 146, 156 
background of, 160-161 
nonverbal matters in, 161 
Number concept, development of, 176-181 


o 


Objects, concept hypothesis, 180-181, 206 
and growth transformations, 124—129 
perception of, 103 
seriation of, 357 

Objectside, 
and attention, 143-144 
introduction of term, 140-141 
and sensibility, 169 

Occlusion, 181-182 
as property in world, 53 
information for, 61 

Operative knowing hypothesis, 17 1-173, 

175,177, 179, 182 

Optic array 
natural ys. pictorial, 61-62 
surrounding point of observation, 51 
transformation of, 50, 60-61 

Optical flow, 47-49 
for moving observor, 55 

Orienting system 
in locomotion, 246 
and orientation, 254, 381 

Ostensive definition, and attention, 143 


P 


Paintings 
abstract, 135—169 
causal nature of effects of, 135 
cubism, 135, 143, 154, 159 


Paradigms 
exemplars, 281 
Kuhn’s theory of, 267 
Parametrization 
of environmental tuning, 240 
of joint-muscle systems, 241-245, 250 
in nervous system tuning, 240 
Pattern recognition 
as generative knowledge, 373-374 
and physical variables, 373 
problem defined, 283-284 
production of meaning, 285, 296 
and recognition, 306 
and resemblance, 380 
in speech perception, 317-318, 320, 323 
invariants in, 339-341 
by machine, 327 
normalization hypothesis in, 325, 329, 
330 
rejection of normalization, 338 
two-step procedure for vowels, 322 
for vowels in syllables, 322, 333 
Perception 
and action, 211-212, 222, 228, 246-248, 
255-259 
of aging faces, 55, 124-129 
auditory, 93, see also Speech perception 
as based on detection of information, 50 
of causality, 374-376 
as Causal process ys. epistemic act, 33, 
37-38 
of change, 52, 124-129, 248-259 
compared to imagination, 46, 55 
compared to inference, 28-29, 46, 55 
compared to memory, 29-30, 46, 55 
consciousness of, 282 
development of, in Piaget’s theory, 171 
as discrete, 182 
as direct, 252 
direct vs. indirect, 88-89 
as dual representation of neural events, 
259 
ecologically based theoretical language 
for, 109 
elementarism vs. holism, 36-38 
empirical antecedents vs. phenomenologi- 
cal consequents, 108 
of events, 129-130, 373-376, 385 
of faces, 117, 124-129 
in first person singular and plural, 
109-110 


Perception (contd.) 
Gibson’s approach to, 285, 381, see also 
Ecological approach 
and hallucination, 288-289 
and knowledge, 279 
language of perceptual theory, 108-109 
of layout, 131 
mechanisms of, 44—45 
motor theories, 282-289, 302 
and pattern recognition, 283-285 
problem of veridical, 288 
of nonchange, 52, 126-128, 248-259, 
278, 365 
second hand, 61 
somatosensory, 93 
space perception, 43,45 
of speech, see speech perception 
of transparent depth, 55 
two mode hypothesis, 305 
visual, see Visual perception 
of vowels, 322-335, 339-341 
Percepts, 44, 46, 48—49 
biologically significant, 46 
construction of, 433 
from first person perspective, 108 
in first person plural, 109-110 
with reference to state of agent picking up 
information, 110 
as subjective experience, 110 
Perceptual constancy 
abstract nature of, 339 
normalization hypothesis, 325, 329-330 
and perception of consonants, 339 
and perception of vowels, 335-341 
and phonemic categories, 3 16 
problem in speech perception, 315 
rejection of normalization hypothesis, 338 
Perceptual error, 16-18, 76, 80-82 
Perceptual learning, 80, see also Differentia- 
tion; Education of attention 
Perceptual processing, 46 
and decomposition of information, 49 
as direct detection, 57 
in moving observor, 49 
Perceptual refinement 
appropriate to abstract art, 193: 
as cause of sensibility, 152 
development, 142 
example of, 141 
and sensing painting, 162, 164, 168 
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Perceptual systems, 44,47, 49, see also 
Information, perceptual 
for locomoting, 249-259, 283 
overlap of, 53 
resolution of conflict among, 58 
Perspective in Piaget’s theory, 183-185 
Phenomenal environment, 43, 70 
Phenomenalism, 11,21, 128 
Phenomenology, and perception, 108-109 
Phoneme(s) 
and ambient acoustic array, 317-318 
and categories in speech perception, 316, 
332 
as meaningless units, 316 
and minimal perceptual units, 319 
as nondiscrete unit in sound, 318-319 
and phonemic messages, 324 
and syllables, 339 
Physics 
classical, 6, 21, 24,47, 59, 67, 70, 78 
concept of space in, 106 
ecological physics, 67, 78, 111,375 
innapropriateness of classical for 
perceptual theory, 106 
Newton’s problem, 111 
Pictures, 61-63 
abstract art, 135-169 
caricatures, 105 
line drawings, 147, 162 
plane of projection, 135, 150 
silhouettes, 105 
Point of observation 
in definition of information for vision, 50 
as immersed in array, 51 
moving, 49 
as nodal point of observor’s eye, 51 
as place in environment, 49 
Pragmatism and contextualism, 415 
Prediction, 200-203 
Prism 
adaptation to, 286-287 
Kohler’s studies, 62 
Profile(s), 129 
of heads, 123 
as homeomorphism of cardioid, 123 
and perception of individual identities, 
126-128 
and shear transformations, 124-126 
of skulls, 118 
and strain transformations, 124-129 
Proof(s), 191, 200, 203-204 
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Psycholinguistics, 9, 44, 268 
and research on linguistic form, 365 
Psychophysics, experiments, 44 
in Gibson’s theory, 45-47, 50, 63-64 
and perceptual units, 315 
psychophysical approach vs. ecological 
approach, 47-48, 375 
rejection of traditional theory by Gibson, 
49 
in speech perception, 320 
in Troland’s theory, 44 


a 


Qualities 
primary, 103 
secondary, 103 
Quantum mechanics, 22, 95 


R 


Realism, 11-13 
in art, 151 
constructive, 14-15 
critical, 18-19, 21-23 
cynical, 23 
direct, 11-12, 15, 20-23, 30, 38 
indirect, 11-12, 20-21, 28, 34-36, 38 
naive, 18-21, 23, 104 

Reafference, 248 

Receptive field, 85 

Reference, in affordances, 67, 69-70, 79 
development of capacity to use, 35 
and problem of meaning, 349, 360n., 393, 

398, 401n. 
and referents, 350, 361 
and “sense” (sinn und bedeutung), 398, 
401n. 

Reference system, 183, 185-187 
spatial, 182 

Reflexes, 237, 250, 273-274, see also 

Coordinative structures 

elemental, 249 
as generator for action, 219 
H, 238, 244 
local sign property of, 251 
patellar, 236 
in spinal cord, 231, 234-235 
structures of, 218-219, 223, 233 
tendon, 236 
tuning, 249 
wide ranging vs. local, 223 


Remembering, contextual constraints on, 
443-444 
differentiation, 453 
vs. encoding, 453 
vs. learning and perceiving, 432-433 
and memory metaphor, 439-441 
re-creation metaphor, 449-456 
and thinking, 449 
Resemblance(s) 
discovery of, 379 
in faces, 117, 126-128 
family , 349-350, 359 
gradations of similarity, 370 
and imitative language, 366 
in linguistic forms, 365, 380 
of meanings, 348 
and metaphor, 373 
perception of, 365, 384 
Resonance, 
attunement and set constraints, 450 
and remembering, 450 
Retina, 81, 85,92, 141 
stimulation of, 286 
Retinal image, 11, 13, 24, 26, 28-30, 
34-35, 38, 81, 84, 107 
as ambiguous source of information, 49 
reversal of, 62 
as stimulus for vision, 62 
Reversibility 
in logic, 177 
in thought, 177 
Rods, 44 
for dark vision, 141 


sS 


Sampling of flowing texture, 48 
Scanning 
in art, 137—138, 162-163 
foveal vs. nonfoveal, 149-150 
nonfocused, 150 
Schema in Piaget’s theory, 172-173 
Schemes in Piaget’s theory, 173, 175—176, 
183 
Scission effect, 49 
Semantic(s) 
categories, 420 
developmental theories of, 350-351 
and intentions and speech acts, 370-371 
Katz’s theory of, 368 
markers, 369, 372 
network, 406 


Semantic(s) (contd.) 
properties, 406, 428 
regress, 8-9 
standard “readings” in, 370 
Semantic memory 
categories in, 396-399, 407-408 
content addressed, 400 
in contextualist experiments, 421-424 
definition of, 393 
information for, 399 
retrieval hypothesis of, 394-396 
storage hypothesis of, 400-401 
Sensation(s), 44, 79, 81, 269 
in art 
of balance, 136 
of direction, 136 
of mass, 136 
and critical claims in art, 143 
enrichment of, 433 
meaning of, 434 
relation to perception, 58, 284 
theories based on, 56-58, 79, 268-272 
Sense data theory, 11 
Senses 
as perceptual systems, 58-59 
as separate channels, 58 
Sensible intuition, 14 
in art, 138 
Sensibility 
and appreciation of art, 162-166 
and attention in perception of paintings, 
135-169 
and critical claim, 143 
as differing from visual acts, 147 
to elements of art, 158-163 
and elements of painting, 147 
partiality of, 137 
and perception, 155-158 
and philosophy of mind, 166 
as visual act, 142 
Sensorimotor, 175, 218 
as dual, 259 
stage in Piaget’s theory, 173, 206, 358 
Sensory conception of mind rejected, 
268-269, 273, 279, 294, 299, 300 
Sensory mode and conventionality of sensory- 
motor distinction, 61, 211-212, 259 
Sensory modalities, 290-291 
Sensory nerves, 274 
Sensory qualities hypothesis, 290-291, 305 
Sensory systems, 211-212, 270, 276 
as motoric, 283, 287 
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Shadows, changing pattern of related to 
slant, 54-55 
Shape, 
changes in by shear transformation, 
124-129 
by strain transformation, 125-126 
classification of in development, 353 
concept of in research, 106, 113 
definition through hierarchy of 
geometries, 113-116 
equivalence of, 116 
Euclidean description of, 103-104 
of face through growth, 117-124 
information for, 115 
as primary quality, 103 
rigidity in object perception, 104 
topological invariants for, 123 
Shear transformation, 
in affine geometry, 114-115, 124 
in ecological information for age, 
124-126 
as global, 120-121 
in growth, 119 
Sign(s) 
definition of, 275 
and signification, 276, 302 
Similarity vs. specificity of features in 
ambient array, 51 
Similes 
and analogies, 382 
in language, 365 
Simplicity, 171, 198, 203, 206 
Skill, see Action 
Space 
concept of 
Cartesian, 121 
child’s, 107 
Einsteinian, 106, 111 
Newton’s, 106, 111 
Riemann’s, 106, 111, 117-130 
as ecological information, 110 
global structure of, 111 
and hierarchy of geometries, 113-116, 
130 ; 
ontogenesis, 107 
in paintings, due to pictorial “cues,” 136 
topological and growth, 118-1 19, 123, 
126-129 
Specific energies of nerves hypothesis, 291 
Specious present, 416 
Speech aquisition, assumptions in theory of, 
317 
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Speech perception 
action-based theory, 258-259, 284, 289 
adaptation level hypothesis, 326-327 
classification in, 315, 320 
direct, 339n. 
events in, 416 
by machines, 316-317 
misperception of vowels in, 326-331 
normalization hypothesis, 325 
percentage error in due to uncertainty 
about speaker, 328 
and psychology of perception, 315, 
335-341 
and recognition, 317-318, 321, 335-341 
and resemblance, 365 
role of syllable in, 331-341 
and segmentation of acoustic array, 321, 
335 
and specification of vowels, 324-328, 
331-332 
vocal tract theories of, 258-259, 331, 
333, 336, 338-341 
Spinal cord, 218, 223, 230, 235, 238-239 
as active, 231, 239 
connection to visual system, 252-259 
as segmental subsystem, 232-235 
and supraspinal functioning, 253 
tuning of, 243-244, 250 
Split-brain phenomena, 255, 293 
Stage setting metaphor, 
and active relationships, 438 
and general attunement constraints, 
441-444 
and linguistic comprehension, 436-438 
vs. memory metaphor, 434 
and recognition, 435 
recreation vs. search, 440 
Stimulus, 44, 80 
as building block for perception, 49 
to elicit behavior, 49 
environment dependent, 45 
in Gestalt theory, 63-64 
nonoptical properties of, 58-59, 61 
patterning, 44, 58 
structure of, 45, 56-57 
variables of stimulation, 56 
Stimulus-response psychology, 46, 268, 
271 
Strain transformation, 
in affine geometry, 114-115 
cardioidal, 118-125 
coefficient of vis a vis cardioidal, 122 


Strain transformation (contd.) 
ecological information for individual 
identity, 126-128 
as global, 120-121 
homeomorphisms of, 122 
as modeling growth, 119-121, 126-128 
nontopological, 123 
Structural invariant, see Symmetry 
Structuralism, 1 
Structure, 
of abstract, in art, 159 
in acoustic array, 285 
in array, in spatiotemporal relations, 56 
coalitional, 223-225, 246-248, 307-308 
cognitive, 173, 181, 183, 270 
cognitive development, 353 
cognitive linguistic, 348, 352-358, 362 
correspondence to affordances, 60 
defined in terms of units, 51 
in environment, 45 
exteroceptive, 52 
exterospecific, 52 
in group theory, 259 
and growth space, 118-125 
as independent of observor, 366 
in light, 44 
in memory, 406 
of objects with regard to space, 106, 
129-131 
optical, 49, 61 
propriospecific, 52 
and Riemannian space, 125 
and shape, 113-116 
of space, 106, 111-112, 130 
of space by properties of the central 
nervous system, 107 
of stimulus, 45-47 
Surfaces, 44-45, 48, 67, 70-74, 76, 80 
affording support, 60 
slant of, 44,48 
substantiality and rigidity of, 51 
in visually controlled locomotion, 44, 
246-248 2 
Surface structure, 213-214 
cortical representation of action plans as, 
229 
as genera, 279 
and grammar, 367 
Syllable(s) 
centers, 338 
as containing sufficient information for 
recognition of phonemes, 321, 329-331 


Syllable(s) (contd.) 

disambiguation of by vowels, 327 

embedding of vowels in, 325, 331 

information in, 326 

as nondiscrete, 335 

perception of randomly ordered, 337 

precursors as producers of error, 329-330 

in speech perception, 319, 326, 328 

and variations in utterences, 323, 

325-326 

and vowel detection, 325-326, 328-329 
Symbol 

and arbitrariness of reference, 393 

definition of, 275 

in language, 365, 377 

in memory, 383-384 

and symbolization, 276, 302 
Symmetry, 96, 229 

of body, 254 

breaking operations, 114, 116 

in expanding flow of optic array, 248 

for facial growth, 123-124 

of growth spaces, 119 

spaces, 187 

as structural invariant, 113 

theory and geometry, 116 
Symmetry operation(s) 

definition of, 113 

in hierarchy of geometries, 114-116 

topological homeomorphism of, 118 
Synchronic, organization in linguistic 

meaning, 347-352 

System 

autonomous, 239 

coalitional, 307-308 

closed, 199, 205 

open, 199-200, 205 

vibratory, 242-244 


T 


Tacit knowledge, 281, see also Unconscious 
and meanings, 296, 372 
and mind-body problem, 294 
Terrain, 45, 72-73, 82, 246-248 
Texture, 81, 171 
in ambient optic array, 248 
appearance and disappearance in Kaplan’s 
work, 55 
flow, 44, 48—49 
gradients, 44—46, 249 
gradient and correspondence to slant, 54 


SUBJECT INDEX 491 


Texture (contd.) 
gradient in paintings, 139 
as instance of structure in an array, 56 
as relations in events, 416, 421—423 
surface, 150 
Thought 
compensatory property of, 177 
and cognition, 277 
figural vs. symbolic, 377-378, 381-382 
hypothesis of sensations in, 269 
organization in, 347 
and Piaget’s preoperatory stage, 178 
as skill, 304 
Topology, 
and conception of shape, 116-124 
in hierarchy of geometries, 113, 116 
Trajectory of joint-muscle system, 230, 235 
Transcendent formism, Pepper’s definition 
of, 367 
Transformation, 
appropriate for acoustical invariants, 336 
by displacement, 113 
of domains, 386 
global, 48, 120-121 
in group theory, 113,279 
nonrigid, 105, 117 
of optic array, 246-248, 252-253 
of similitude, 114, 117 
Transformational grammar theory and 
traditional linguistic assumptions, 367 
Transformational invariant, see Symmetry 
operation 
Transformations 
groups of 
Affine, 114-115, 121-124 
ratio of division, 115 
shear, 114-115, 124 
shear and perception of facial ~7"~> 
124-126 
strain, 114-115, 121 
Euclidean, 113-114, 183 
radial or homethetic, 114, 117 
ratio of similitude, 114 
Projective, 121-124 
an harmonic or cross ratio, 115 
as nonmetric, 130 
preservation of linearity (incidence), 
117, 123 
strain in, 122-123 
Topological, 75, 121-124, 177-178, 
185-187, 228 
cardioidal, 118 
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Transformations 
Topological (contd. ) 
continuity in point-set topology, 116 
in growth space, cardioidal strains, 
118-129 
homeomorphisms defined, 116-117 
homeomorphism in growth space, 119 
as vs. projective and affine, 122-124 
relevance to perceptual theory, 117 
Transitive inference, 175-176 
Tuning 
as accurate motor displacement, 230, 233 
of actions by ecological information, 
251-259 
and pretuning, 235 
specific to agonist activity, 238 
not specified by action plan, 245 
of skilled behavior, 240 
Turing machines, 197, 200, see also 
Abstract automata 
Typology and essentialism in linguistic 
taxonomy, 367-370 


U 


Unconscious, see also Tacit knowledge 
in commissurotomized patients, 294 
in sensory theory, 280, 301 
as subconscious, 280, 301 
theories of, 280 


v 


Valences, 78 
Value, see Affordances and Meaning 
Velocity in motor functioning, 229-230, 
235 
Viscal, see also Elastic 
Viscal component of growth, 116-117 
Viscal constraints in growth, 119 
Viscal, defined, 123-124 
Viscal events, 129 
Vision, see also Perception and Visual 
system 
averted, 149, 152-153, 155-156, 165, 
168 
foveal vs. nonfoveal, 149-150, 152-153 


Vision (contd.) 
as motor:system, 286 
passive reception hypothesis, 268 
Visual act(s), and attention, 142 
identification, 157-158 
recognition, 157-158 
Visual attention 
analytic, 152, 163-166 
to color, 145-146, 159 
and contextualism in art, 139 
definition of, 140-141 
eyeside aspects, 145-146 
learning of, 141 
means of directing, 140-141 
vs. noticing, 154 
and ostention, 143 
and pointing, 144-145 
synoptic, 152, 163-166 
variability of, 150 
Visual cliff, 80 
Visual cues, 
conflict among in art, 138-139 
as information in visual world, 138 
misleading, 147 
in paintings, 137, 140 
Visual field, 48 
Visual kinesthesis, 52 
Visual perception, see Visual system 
Visual solid angle, 50, 73 
Visual system, 43, 80, 85 
connection to spinal cord, 252-253 
in locomotion, 246-248 
and visual perception, 48, 211 
and visual perception of aging, 124-131 
and visual perception of form, 129-131, 
286 
and visual perception as motoric, 286-289 
related to action, 254 
in tuning of motor activity, 257-259 


w 


Word association(s), 
hypothesis, rejection of, 414-415 
Jenkins’ and Russell’s research, 413 
tests, 413 
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